Oracle: replacing a character in SQL results

Question

Asked by FashionHouseJewelry.com

This is from an Oracle SQL query. The data has weird skinny rectangle shapes in places where apostrophes should be. (I wish we could paste screenshots in here.)

It looks like this when I copy and paste the results.

spouse?s

Is there a way to write a SQL SELECT statement that searches for this character in the field and replaces it with an apostrophe in the results?

Edit: I only need to change the results of a SELECT statement for reporting purposes; I can't change the database.

I ran this

select dump('?') from dual;

which returned

Typ=96 Len=3: 239,191,189

This seems to work so far

select translate('What is your spouse?s first name?', '?', '''') from dual;

but this doesn't work

select translate(Fieldname, '?', '''') from TableName

Select FN from TN

What is your spouse?s first name?

SELECT DUMP(FN, 1016) from TN

Typ=1 Len=33 CharacterSet=US7ASCII: 57,68,61,74,20,69,73,20,79,6f,75,72,20,73,70,6f,75,73,65,92,73,20,66,69,72,73,74,20,6e,61,6d,65,3f

EDIT: So I have established that it is the backquote character. I can't get the DB updated, so I'm trying this code:

SELECT REGEX_REPLACE(FN,"\0092","\0027") FROM TN

and I'm getting ORA-00904: "Regex_Replace": invalid identifier

Answer 1

Answered by FerranB

This looks like a problem with your charset configuration. Check your NLS_LANG and other NLS_xxx environment/registry values. You have to check the Oracle server, your client, and the client that inserted the data.

Try to DUMP the value. You can do it with a SELECT as simple as:

SELECT DUMP(the_column)
  FROM xxx
 WHERE xxx

UPDATE: Before trying to replace the character, look for the root of the problem. If this happens because of a charset issue, you can end up with serious bad-data problems.

UPDATE 2: Answering the comments. The problem may not be on the database server side; it may be on the client side. The problem (if this is the problem) can be a conversion in the communication between server and client, caused by a mismatched server-client configuration. For instance, if the server uses the UTF8 charset and your client uses US7ASCII, then all accented characters will appear as ?.
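
The substitution described in UPDATE 2 can be imitated outside the database. This is a minimal sketch in Python, assuming Python's `ascii` codec is an acceptable stand-in for a US7ASCII client; Oracle's own NLS conversion is not reproduced here, and the variable names are only illustrative.

```python
# Stand-in for the stored text: the "apostrophe" is really U+2019, a smart quote.
stored = "What is your spouse\u2019s first name?"

# A client limited to 7-bit ASCII cannot represent U+2019, so a lossy
# conversion substitutes a replacement character ('?' for Python's codec).
as_seen_by_client = stored.encode("ascii", errors="replace").decode("ascii")

print(as_seen_by_client)  # What is your spouse?s first name?
```

Once the substitution has happened, the original character is gone from the client's copy, which is why searching for a pasted ? cannot match the stored value.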

Another possibility: if the server uses the UTF8 charset and your client also uses UTF8 but the application is not able to display UTF8 characters, then the problem is on the application side.

UPDATE 3: On your examples:

select translate('What...: It works because the ? is exactly the same char on both sides: you pasted it in both places.
select translate(Fieldname...: It does not work because the ? is not what is stored in the database. It is the char the client receives, probably because some conversion happens between the table and what is shown to you.

Next step: Look at the DUMP syntax and try to extract the codes for the mysterious char (from the table, not by pasting the ?!).

Answer 2

Answered by Adam Crume

I would say there's a good chance the character is a single-tick "smart quote" (I hate the name). The smart quotes are bytes 0x91-0x94 in the Windows-1252 encoding, or Unicode U+2018, U+2019, U+201C, and U+201D.
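
The mapping in this answer can be checked directly, reading 91-94 as hexadecimal byte values. This is a small sketch in Python, assuming its `cp1252` codec matches the Windows code page the answer has in mind; `smart_quotes` is just an illustrative name.

```python
# Decode each of the bytes 0x91-0x94 with the Windows-1252 code page.
smart_quotes = {
    byte: bytes([byte]).decode("cp1252") for byte in range(0x91, 0x95)
}

for byte, char in smart_quotes.items():
    # Show the byte value next to the Unicode code point it maps to.
    print(f"0x{byte:02X} -> U+{ord(char):04X} {char}")
```

Byte 0x92, the one that shows up in the question's DUMP output, is the right single quotation mark U+2019.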

Answer 3

Answered by J. Polfer

I'm going to propose a front-end application-based, client-side approach to the problem:

I suspect that this problem has more to do with a mismatch between the font you are using to display the word spouse?s and the character ?. That icon appears when you try to display a character in a Unicode font that doesn't have a glyph for the character's code.

The Oracle database will dutifully return whatever characters were INSERTed into its column. It's up to you and your application to interpret how they will look in the font you use to display the data, so I suggest investigating what this mysterious ? character that is replacing your apostrophes actually is. Start by using FerranB's recommended DUMP().

Try running the following query to get the character code:

SELECT DUMP(<column with weird character>, 1016) 
FROM <your table> 
WHERE <column with weird character> like '%spouse%';

If that doesn't grab your actual text from the database, you'll need to modify the WHERE clause so that it actually matches the offending row.
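
To get from the DUMP output to the character code, the hex list can be scanned for bytes outside 7-bit ASCII. This is a minimal sketch in Python that uses the DUMP(FN, 1016) line quoted in the question; `suspicious_bytes` is a hypothetical helper name, not part of any Oracle tooling.

```python
# The DUMP(FN, 1016) result from the question, copied verbatim.
dump_output = (
    "Typ=1 Len=33 CharacterSet=US7ASCII: "
    "57,68,61,74,20,69,73,20,79,6f,75,72,20,73,70,6f,75,73,65,92,"
    "73,20,66,69,72,73,74,20,6e,61,6d,65,3f"
)


def suspicious_bytes(dump_line):
    """Return the raw bytes and the (offset, hex code) of every non-ASCII byte."""
    _header, _, codes = dump_line.partition(": ")
    raw = bytes(int(code, 16) for code in codes.split(","))
    hits = [(offset, hex(b)) for offset, b in enumerate(raw) if b > 0x7F]
    return raw, hits


raw, hits = suspicious_bytes(dump_output)
print(hits)  # [(19, '0x92')]
```

The single hit is the byte right after "spouse", which is the position where the apostrophe should be.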

Once you've found the code for the character, you could replace it using the built-in REGEXP_REPLACE() function: determine the raw hex code of the character, then supply the ASCII / C0 Controls and Basic Latin character 0x0027 ('), using code similar to this:

UPDATE <table>
    SET <column with offending character>
            = REGEXP_REPLACE(<column with offending character>,
                             '<character code of ?>',
                             '''')  -- a single apostrophe, escaped by doubling
WHERE REGEXP_LIKE(<column with offending character>, '<character code of ?>');

If you aren't familiar with Unicode and the different ways of encoding characters, I recommend reading Joel's article The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!). I wasn't familiar with them either until I read that article.

EDIT: If you're seeing 0x92, there's likely a charset mismatch here:

0x92 in CP-1252 (the default Windows code page) is the right single quotation mark (’, U+2019), which looks like a curly apostrophe. This code isn't a valid ASCII character, and it isn't a printable character in ISO-8859-1 either. So probably either the database is in CP-1252 encoding (I don't find that likely), or a database connection that spoke CP-1252 inserted it, or the apostrophe somehow got converted to 0x92. The database is returning values that are valid in CP-1252 (or some other charset where 0x92 is valid), but your DB client connection isn't expecting CP-1252. Hence the weird question mark.
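
The effect of reading the same byte under different charsets can be tried out in a few lines. This is a sketch in Python, assuming its `ascii`, `cp1252` and `utf-8` codecs are fair stand-ins for the charsets discussed; it does not model Oracle's client conversion itself.

```python
raw = b"spouse\x92s"  # the bytes DUMP reports for this part of the column

# 0x92 is outside 7-bit ASCII, so a strict ASCII decode fails.
try:
    raw.decode("ascii")
    ascii_ok = True
except UnicodeDecodeError:
    ascii_ok = False

# Read as CP-1252, the byte is the right single quotation mark U+2019.
as_cp1252 = raw.decode("cp1252")

# Read as UTF-8, the lone 0x92 is invalid and becomes U+FFFD, whose UTF-8
# encoding is the 239,191,189 that the question got back from DUMP('?').
as_utf8 = raw.decode("utf-8", errors="replace")
replacement_bytes = list("\ufffd".encode("utf-8"))

print(ascii_ok, as_cp1252, as_utf8, replacement_bytes)
```

This suggests the pasted ? in the question is the Unicode replacement character U+FFFD rather than the stored byte, which would explain why matching on it finds nothing in the column.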

And FerranB is likely right. I would talk with your DBA or another admin to get the issue straightened out. If you can't, I would try either doing the update above (which it seems you can't do), or doing this:

INSERT INTO <table> (<normal table columns>,...,<column with offending character>)
SELECT <all normal columns>,
       REGEXP_REPLACE(<column with offending character>,
                      CHR(146),  -- 0x92, the offending byte (assumes a single-byte DB charset)
                      CHR(39))   -- 0x27, the ASCII/ISO-8859-1 apostrophe
FROM <table>
WHERE REGEXP_LIKE(<column with offending character>, CHR(146));

DELETE FROM <table> WHERE REGEXP_LIKE(<column with offending character>, CHR(146));

Answer 4

Answered by Stefan Arentz

Before you do this, you need to understand what actually happened. It looks to me like someone inserted non-ASCII strings into the database, for example Unicode or UTF-8. Before you fix this, be very sure that it is actually a bug. The apostrophe comes in many forms, not just "'".

Answer 5

Answered by David Aldridge

TRANSLATE() is a useful function for replacing or eliminating known single-character codes.
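
The same one-for-one character mapping can also be applied on the client after the rows are fetched, which fits the constraint that the database cannot be changed. This is a sketch in Python, assuming the client has already decoded the column as CP-1252; `SMART_TO_ASCII` and `plain_quotes` are hypothetical names, not an existing API.

```python
# Map the four smart quotes to their plain ASCII counterparts.
SMART_TO_ASCII = str.maketrans({
    "\u2018": "'",   # left single quotation mark
    "\u2019": "'",   # right single quotation mark (byte 0x92 in CP-1252)
    "\u201c": '"',   # left double quotation mark
    "\u201d": '"',   # right double quotation mark
})


def plain_quotes(text):
    """Return text with smart quotes replaced by ASCII quotes."""
    return text.translate(SMART_TO_ASCII)


print(plain_quotes(b"What is your spouse\x92s first name?".decode("cp1252")))
```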
