SQL SELECT WHERE 字段包含单词

声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow 原文地址: http://stackoverflow.com/questions/14290857/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me): StackOverFlow

提示:将鼠标放在中文语句上可以显示对应的英文。显示中英文
时间:2020-09-01 13:07:00  来源:igfitidea点击:

SQL SELECT WHERE field contains words

sqlselect

提问by Mario M

I need a select which would return results like this:

我需要一个选择,它会返回这样的结果:

SELECT * FROM MyTable WHERE Column1 CONTAINS 'word1 word2 word3'

And I need all results, i.e. this includes strings with 'word2 word3 word1' or 'word1 word3 word2' or any other combination of the three.

我需要所有结果,即这包括带有“word2 word3 word1”或“word1 word3 word2”或三者的任何其他组合的字符串。

All words need to be in the result.

所有的词都需要在结果中。

回答by mvp

Rather slow, but working method to include anyof words:

相当慢,但包括任何单词的工作方法:

SELECT * FROM mytable
WHERE column1 LIKE '%word1%'
   OR column1 LIKE '%word2%'
   OR column1 LIKE '%word3%'

If you need allwords to be present, use this:

如果您需要出现所有单词,请使用以下命令:

SELECT * FROM mytable
WHERE column1 LIKE '%word1%'
  AND column1 LIKE '%word2%'
  AND column1 LIKE '%word3%'

If you want something faster, you need to look into full text search, and this is very specific for each database type.

如果您想要更快的速度,您需要查看全文搜索,这对于每种数据库类型都是非常具体的。

回答by Sam

Note that if you use LIKEto determine if a string is a substring of another string, you must escape the pattern matching characters in your search string.

请注意,如果LIKE用于确定一个字符串是否是另一个字符串的子字符串,则必须对搜索字符串中的模式匹配字符进行转义。

If your SQL dialect supports CHARINDEX, it's a lot easier to use it instead:

如果您的 SQL 方言支持CHARINDEX,则改用它会容易得多:

SELECT * FROM MyTable
WHERE CHARINDEX('word1', Column1) > 0
  AND CHARINDEX('word2', Column1) > 0
  AND CHARINDEX('word3', Column1) > 0

Also, please keep in mind that this and the method in the accepted answer only cover substring matching rather than word matching. So, for example, the string 'word1word2word3'would still match.

另外,请记住,this 和已接受答案中的方法仅涵盖子字符串匹配而不是单词匹配。因此,例如,字符串'word1word2word3'仍将匹配。

回答by Eduardo Cuomo

Function

功能

 CREATE FUNCTION [dbo].[fnSplit] ( @sep CHAR(1), @str VARCHAR(512) )
 RETURNS TABLE AS
 RETURN (
           WITH Pieces(pn, start, stop) AS (
           SELECT 1, 1, CHARINDEX(@sep, @str)
           UNION ALL
           SELECT pn + 1, stop + 1, CHARINDEX(@sep, @str, stop + 1)
           FROM Pieces
           WHERE stop > 0
      )

      SELECT
           pn AS Id,
           SUBSTRING(@str, start, CASE WHEN stop > 0 THEN stop - start ELSE 512 END) AS Data
      FROM
           Pieces
 )

Query

询问

 DECLARE @FilterTable TABLE (Data VARCHAR(512))

 INSERT INTO @FilterTable (Data)
 SELECT DISTINCT S.Data
 FROM fnSplit(' ', 'word1 word2 word3') S -- Contains words

 SELECT DISTINCT
      T.*
 FROM
      MyTable T
      INNER JOIN @FilterTable F1 ON T.Column1 LIKE '%' + F1.Data + '%'
      LEFT JOIN @FilterTable F2 ON T.Column1 NOT LIKE '%' + F2.Data + '%'
 WHERE
      F2.Data IS NULL

回答by messed-up

Instead of SELECT * FROM MyTable WHERE Column1 CONTAINS 'word1 word2 word3', add And in between those words like:

代替SELECT * FROM MyTable WHERE Column1 CONTAINS 'word1 word2 word3',在这些词之间添加 And ,例如:

SELECT * FROM MyTable WHERE Column1 CONTAINS 'word1 And word2 And word3'

for details, see here https://msdn.microsoft.com/en-us/library/ms187787.aspx

有关详细信息,请参阅此处 https://msdn.microsoft.com/en-us/library/ms187787.aspx

UPDATE

更新

For selecting phrases, use double quotes like:

要选择短语,请使用双引号,例如:

SELECT * FROM MyTable WHERE Column1 CONTAINS '"Phrase one" And word2 And "Phrase Two"'

p.s.you have to first enable Full Text Search on the table before using contains keyword. for more details, See here https://docs.microsoft.com/en-us/sql/relational-databases/search/get-started-with-full-text-search

ps在使用 contains 关键字之前,您必须先在表格上启用全文搜索。有关更多详细信息,请参阅此处https://docs.microsoft.com/en-us/sql/relational-databases/search/get-started-with-full-text-search

回答by Jon Crowell

SELECT * FROM MyTable WHERE 
Column1 LIKE '%word1%'
AND Column1 LIKE '%word2%'
AND Column1 LIKE  '%word3%'

Changed ORto ANDbased on edit to question.

更改ORAND基于对问题的编辑。

回答by mirmdasif

If you are using Oracle Databasethen you can achieve this using containsquery. Contains querys are faster than like query.

如果您使用的是Oracle 数据库,那么您可以使用contains查询来实现这一点。包含查询比类似查询更快。

If you need all of the words

如果你需要所有的话

SELECT * FROM MyTable WHERE CONTAINS(Column1,'word1 and word2 and word3', 1) > 0

If you need any of the words

如果你需要任何的话

SELECT * FROM MyTable WHERE CONTAINS(Column1,'word1 or word2 or word3', 1) > 0

Contains need index of type CONTEXTon your column.

在您的列上包含需要类型CONTEXT 的索引。

CREATE INDEX SEARCH_IDX ON MyTable(Column) INDEXTYPE IS CTXSYS.CONTEXT

回答by Joshua Balan

If you just want to find a match.

如果你只是想找到一个匹配。

SELECT * FROM MyTable WHERE INSTR('word1 word2 word3',Column1)<>0

SQL Server :

SQL 服务器:

CHARINDEX(Column1, 'word1 word2 word3', 1)<>0

To get exact match. Example (';a;ab;ac;',';b;')will not get a match.

获得精确匹配。示例(';a;ab;ac;',';b;')不会得到匹配。

SELECT * FROM MyTable WHERE INSTR(';word1;word2;word3;',';'||Column1||';')<>0

回答by Anastasios Selmanis

One of the easiest ways to achiever what is mentioned in the question is by using CONTAINSwith NEAR or '~'. For example the following queries would give us all the columns that specifically include word1, word2 and word3.

实现问题中提到的最简单方法之一是使用带有 NEAR 或“~”的CONTAINS。例如,以下查询将为我们提供特别包含 word1、word2 和 word3 的所有列。

SELECT * FROM MyTable WHERE CONTAINS(Column1, 'word1 NEAR word2 NEAR word3')

SELECT * FROM MyTable WHERE CONTAINS(Column1, 'word1 ~ word2 ~ word3')

In addition, CONTAINSTABLE returns a rank for each document based on the proximity of "word1", "word2" and "word3". For example, if a document contains the sentence, "The word1 is word2 and word3," its ranking would be high because the terms are closer to one another than in other documents.

此外,CONTAINSTABLE 会根据“word1”、“word2”和“word3”的接近度为每个文档返回一个排名。例如,如果一个文档包含句子“The word1 is word2 and word3”,它的排名会很高,因为这些词比其他文档中的词更接近彼此。

One other thing that I would like to add is that we can also use proximity_term to find columns where the words are inside a specific distance between them inside the column phrase.

我想补充的另一件事是,我们还可以使用proximity_term 来查找列短语内单词在它们之间特定距离内的列。

回答by JBelfort

This should ideally be done with the help of sql server full text search if using. However, if you can't get that working on your DB for some reason, here is a performance intensive solution :-

如果使用,这应该最好在 sql server 全文搜索的帮助下完成。但是,如果由于某种原因您无法在数据库上运行,这里是一个性能密集型解决方案:-

-- table to search in
CREATE TABLE dbo.myTable
    (
    myTableId int NOT NULL IDENTITY (1, 1),
    code varchar(200) NOT NULL, 
    description varchar(200) NOT NULL -- this column contains the values we are going to search in 
    )  ON [PRIMARY]
GO

-- function to split space separated search string into individual words
CREATE FUNCTION [dbo].[fnSplit] (@StringInput nvarchar(max),
@Delimiter nvarchar(1))
RETURNS @OutputTable TABLE (
  id nvarchar(1000)
)
AS
BEGIN
  DECLARE @String nvarchar(100);

  WHILE LEN(@StringInput) > 0
  BEGIN
    SET @String = LEFT(@StringInput, ISNULL(NULLIF(CHARINDEX(@Delimiter, @StringInput) - 1, -1),
    LEN(@StringInput)));
    SET @StringInput = SUBSTRING(@StringInput, ISNULL(NULLIF(CHARINDEX
    (
    @Delimiter, @StringInput
    ),
    0
    ), LEN
    (
    @StringInput)
    )
    + 1, LEN(@StringInput));

    INSERT INTO @OutputTable (id)
      VALUES (@String);
  END;

  RETURN;
END;
GO

-- this is the search script which can be optionally converted to a stored procedure /function


declare @search varchar(max) = 'infection upper acute genito'; -- enter your search string here
-- the searched string above should give rows containing the following
-- infection in upper side with acute genitointestinal tract
-- acute infection in upper teeth
-- acute genitointestinal pain

if (len(trim(@search)) = 0) -- if search string is empty, just return records ordered alphabetically
begin
 select 1 as Priority ,myTableid, code, Description from myTable order by Description 
 return;
end

declare @splitTable Table(
wordRank int Identity(1,1), -- individual words are assinged priority order (in order of occurence/position)
word varchar(200)
)
declare @nonWordTable Table( -- table to trim out auxiliary verbs, prepositions etc. from the search
id varchar(200)
)

insert into @nonWordTable values
('of'),
('with'),
('at'),
('in'),
('for'),
('on'),
('by'),
('like'),
('up'),
('off'),
('near'),
('is'),
('are'),
(','),
(':'),
(';')

insert into @splitTable
select id from dbo.fnSplit(@search,' '); -- this function gives you a table with rows containing all the space separated words of the search like in this e.g., the output will be -
--  id
-------------
-- infection
-- upper
-- acute
-- genito

delete s from @splitTable s join @nonWordTable n  on s.word = n.id; -- trimming out non-words here
declare @countOfSearchStrings int = (select count(word) from @splitTable);  -- count of space separated words for search
declare @highestPriority int = POWER(@countOfSearchStrings,3);

with plainMatches as
(
select myTableid, @highestPriority as Priority from myTable where Description like @search  -- exact matches have highest priority
union                                      
select myTableid, @highestPriority-1 as Priority from myTable where Description like  @search + '%'  -- then with something at the end
union                                      
select myTableid, @highestPriority-2 as Priority from myTable where Description like '%' + @search -- then with something at the beginning
union                                      
select myTableid, @highestPriority-3 as Priority from myTable where Description like '%' + @search + '%' -- then if the word falls somewhere in between
),
splitWordMatches as( -- give each searched word a rank based on its position in the searched string
                     -- and calculate its char index in the field to search
select myTable.myTableid, (@countOfSearchStrings - s.wordRank) as Priority, s.word,
wordIndex = CHARINDEX(s.word, myTable.Description)  from myTable join @splitTable s on myTable.Description like '%'+ s.word + '%'
-- and not exists(select myTableid from plainMatches p where p.myTableId = myTable.myTableId) -- need not look into myTables that have already been found in plainmatches as they are highest ranked
                                                                              -- this one takes a long time though, so commenting it, will have no impact on the result
),
matchingRowsWithAllWords as (
 select myTableid, count(myTableid) as myTableCount from splitWordMatches group by(myTableid) having count(myTableid) = @countOfSearchStrings
)
, -- trim off the CTE here if you don't care about the ordering of words to be considered for priority
wordIndexRatings as( -- reverse the char indexes retrived above so that words occuring earlier have higher weightage
                     -- and then normalize them to sequential values
select s.myTableid, Priority, word, ROW_NUMBER() over (partition by s.myTableid order by wordindex desc) as comparativeWordIndex 
from splitWordMatches s join matchingRowsWithAllWords m on s.myTableId = m.myTableId
)
,
wordIndexSequenceRatings as ( -- need to do this to ensure that if the same set of words from search string is found in two rows,
                              -- their sequence in the field value is taken into account for higher priority
    select w.myTableid, w.word, (w.Priority + w.comparativeWordIndex + coalesce(sequncedPriority ,0)) as Priority
    from wordIndexRatings w left join 
    (
     select w1.myTableid, w1.priority, w1.word, w1.comparativeWordIndex, count(w1.myTableid) as sequncedPriority
     from wordIndexRatings w1 join wordIndexRatings w2 on w1.myTableId = w2.myTableId and w1.Priority > w2.Priority and w1.comparativeWordIndex>w2.comparativeWordIndex
     group by w1.myTableid, w1.priority,w1.word, w1.comparativeWordIndex
    ) 
    sequencedPriority on w.myTableId = sequencedPriority.myTableId and w.Priority = sequencedPriority.Priority
),
prioritizedSplitWordMatches as ( -- this calculates the cumulative priority for a field value
select  w1.myTableId, sum(w1.Priority) as OverallPriority from wordIndexSequenceRatings w1 join wordIndexSequenceRatings w2 on w1.myTableId =  w2.myTableId 
where w1.word <> w2.word group by w1.myTableid 
),
completeSet as (
select myTableid, priority from plainMatches -- get plain matches which should be highest ranked
union
select myTableid, OverallPriority as priority from prioritizedSplitWordMatches -- get ranked split word matches (which are ordered based on word rank in search string and sequence)
),
maximizedCompleteSet as( -- set the priority of a field value = maximum priority for that field value
select myTableid, max(priority) as Priority  from completeSet group by myTableId
)
select priority, myTable.myTableid , code, Description from maximizedCompleteSet m join myTable  on m.myTableId = myTable.myTableId 
order by Priority desc, Description -- order by priority desc to get highest rated items on top
--offset 0 rows fetch next 50 rows only -- optional paging

回答by Daryl Arenas

try to use the "tesarus search" in full text index in MS SQL Server. This is much better than using "%" in search if you have millions of records. tesarus have a small amount of memory consumption than the others. try to search this functions :)

尝试在 MS SQL Server 的全文索引中使用“tesarus 搜索”。如果您有数百万条记录,这比在搜索中使用“%”要好得多。tesarus 比其他的有少量的内存消耗。尝试搜索此功能:)