C++ 查找两个字符串中的公共字符
声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow
原文地址: http://stackoverflow.com/questions/21657544/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me):
StackOverFlow
Finding common characters in two strings
提问by nomorequestions
I am coding for the problem in which we got to count the number of common characters in two strings. Main part of the count goes like this
我正在为我们必须计算两个字符串中常见字符数的问题编码。计数的主要部分是这样的
for(i=0; i < strlen(s1); i++) {
for(j = 0; j < strlen(s2); j++) {
if(s1[i] == s2[j]) {
count++;
s2[j] = '*';
break;
}
}
}
This goes with an O(n^2) logic. However I could not think of a better solution than this. Can anyone help me in coding with an O(n) logic.
这符合 O(n^2) 逻辑。但是,我想不出比这更好的解决方案。任何人都可以帮助我使用 O(n) 逻辑进行编码。
回答by haccks
This is very simple. Take two int
arrays freq1
and freq2
. Initialize all its elements to 0
. Then read your strings and store the frequencies of the characters to these arrays. After that compare the arrays freq1
and freq2
to find the common characters.
这很简单。取两个int
数组freq1
和freq2
。将其所有元素初始化为0
. 然后读取您的字符串并将字符的频率存储到这些数组中。之后比较数组freq1
并freq2
找到公共字符。
回答by axiom
It can be done in O(n) time with constant space.
它可以在 O(n) 时间内以恒定空间完成。
The pseudo code goes like this :
伪代码是这样的:
int map1[26], map2[26];
int common_chars = 0;
for c1 in string1:
map1[c1]++;
for c2 in string2:
map2[c2]++;
for i in 1 to 26:
common_chars += min(map1[i], map2[i]);
回答by Paul Hankin
Your current code is O(n^3) because of the O(n) strlen
s and produces incorrect results, for example on "aa", "aa" (which your code will return 4).
由于 O(n) strlen
s,您当前的代码是 O(n^3)并产生不正确的结果,例如“aa”、“aa”(您的代码将返回 4)。
This code counts letters in common (each letter being counted at most once) in O(n).
此代码以 O(n) 计算共同的字母(每个字母最多计数一次)。
int common(const char *a, const char *b) {
int table[256] = {0};
int result = 0;
for (; *a; a++)table[*a]++;
for (; *b; b++)result += (table[*b]-- > 0);
return result;
}
Depending on how you define "letters in common", you may have different logic. Here's some testcases for the definition I'm using (which is size of the multiset intersection).
根据您如何定义“共同字母”,您可能有不同的逻辑。这是我正在使用的定义的一些测试用例(即多集交集的大小)。
int main(int argc, char *argv[]) {
struct { const char *a, *b; int want; } cases[] = {
{"a", "a", 1},
{"a", "b", 0},
{"a", "aa", 1},
{"aa", "a", 1},
{"ccc", "cccc", 3},
{"aaa", "aaa", 3},
{"abc", "cba", 3},
{"aasa", "asad", 3},
};
int fail = 0;
for (int i = 0; i < sizeof(cases) / sizeof(*cases); i++) {
int got = common(cases[i].a, cases[i].b);
if (got != cases[i].want) {
fail = 1;
printf("common(%s, %s) = %d, want %d\n",
cases[i].a, cases[i].b, got, cases[i].want);
}
}
return fail;
}
回答by Mustafa Chelik
You can do it with 2n:
你可以用 2n 来做到:
int i,j, len1 = strlen(s1), len2 = strlen(s2);
unsigned char allChars[256] = { 0 };
int count = 0;
for( i=0; i<len1; i++ )
{
allChars[ (unsigned char) s1[i] ] = 1;
}
for( i=0; i<len2; i++ )
{
if( allChars[ (unsigned char) s1[i] ] == 1 )
{
allChars[ (unsigned char) s2[i] ] = 2;
}
}
for( i=0; i<256; i++ )
{
if( allChars[i] == 2 )
{
cout << allChars[i] << endl;
count++;
}
}
回答by Shridhar.S
Following code traverses each sting only once. So the complexity is O(n). One of the assumptions is that the upper and lower cases are considered same.
以下代码仅遍历每个刺一次。所以复杂度是O(n)。假设之一是大写和小写被认为是相同的。
#include<stdio.h>
int main() {
char a[] = "Hello world";
char b[] = "woowrd";
int x[26] = {0};
int i;
int index;
for (i = 0; a[i] != 'int count(string a, string b)
{
int i,c[26]={0},c1[26]={};
for(i=0;i<a.length();i++)
{
if(97<=a[i]&&a[i]<=123)
c[a[i]-97]++;
}
for(i=0;i<b.length();i++)
{
if(97<=b[i]&&b[i]<=123)
c1[b[i]-97]++;
}
int s=0;
for(i=0;i<26;i++)
{
s=s+abs(c[i]+c1[i]-(c[i]-c1[i]));
}
return (s);
}
'; i++) {
index = a[i] - 'a';
if (index > 26) {
//capital char
index = a[i] - 'A';
}
x[index]++;
}
for (i = 0; b[i] != '
std::vector<char> strIntersect(std::string const&s1, std::string const&s2){
std::vector<bool> presents(256, false); //Assuming ASCII
std::vector<char> intersection;
for (auto c : s1) {
presents[c] = true;
}
for (auto c : s2) {
if (presents[c]){
intersection.push_back(c);
presents[c] = false;
}
}
return intersection;
}
int main() {
std::vector<char> result;
std::string s1 = "El perro de San Roque no tiene rabo, porque Ramon Rodriguez se lo ha cortado";
std::string s2 = "Saint Roque's dog has no tail, because Ramon Rodriguez chopped it off";
//Expected: "S a i n t R o q u e s d g h l , b c m r z p"
result = strIntersect(s1, s2);
for (auto c : result) {
std::cout << c << " ";
}
std::cout << std::endl;
return 0;
}
'; i++) {
index = b[i] - 'a';
if (index > 26) {
//capital char
index = b[i] - 'A';
}
if (x[index] > 0)
x[index] = -1;
}
printf("Common characters in '%s' and '%s' are ", a, b);
for (i = 0; i < 26; i++) {
if (x[i] < 0)
printf("%c", 'a'+i);
}
printf("\n");
}
回答by Raj Katta
// considering the strings to be of lower case.
int main()
{
string s1,s2;
cin>>s1>>s2;
//Declaration for bitset type variables
bitset<26> b_s1,b_s2;
// setting the bits in b_s1 for the encountered characters of string s1
for(auto& i : s1)
{
if(!b_s1[i-'a'])
b_s1[i-'a'] = 1;
}
// setting the bits in b_s2 for the encountered characters of string s2
for(auto& i : s2)
{
if(!b_s2[i-'a'])
b_s2[i-'a'] = 1;
}
// counting the number of set bits by the "Logical AND" operation
// between b_s1 and b_s2
cout<<(b_s1&b_s2).count();
}
This is much easier and better solution
这是更容易和更好的解决方案
回答by Visiedo
First, your code does not run in O(n^2), it runs in O(nm), where n and m are the length of each string.
首先,您的代码不是在 O(n^2) 中运行,而是在 O(nm) 中运行,其中 n 和 m 是每个字符串的长度。
You can do it in O(n+m), but not better, since you have to go through each string, at least once, to see if a character is in both.
您可以在 O(n+m) 中完成它,但不是更好,因为您必须遍历每个字符串,至少一次,以查看一个字符是否在两个字符串中。
An example in C++, assuming:
C++ 中的一个示例,假设:
- ASCII characters
- All characters included (letters, numbers, special, spaces, etc...)
- Case sensitive
- ASCII 字符
- 包括所有字符(字母、数字、特殊字符、空格等...)
- 区分大小写
#define ALPHABETS_COUNT 26
int commonChars(char *s1, char *s2)
{
int c_count = 0, i;
int arr1[ALPHABETS_COUNT] = {0}, arr2[ALPHABETS_COUNT] = {0};
/* Compute the number of occurances of each character */
while (*s1) arr1[*s1++-'a'] += 1;
while (*s2) arr2[*s2++-'a'] += 1;
/* Increment count based on match found */
for(i=0; i<ALPHABETS_COUNT; i++) {
if(arr1[i] == arr2[i]) c_count += arr1[i];
else if(arr1[i]>arr2[i] && arr2[i] != 0) c_count += arr2[i];
else if(arr2[i]>arr1[i] && arr1[i] != 0) c_count += arr1[i];
}
return c_count;
回答by ARUPJYOTI DUTTA
Their is a more better version in c++ : C++ bitset and its application A bitset is an array of bool but each Boolean value is not stored separately instead bitset optimizes the space such that each bool takes 1 bit space only, so space taken by bitset bs is less than that of bool bs[N] and vector bs(N). However, a limitation of bitset is, N must be known at compile time, i.e., a constant (this limitation is not there with vector and dynamic array)
它们是 C++ 中更好的版本:C++ bitset 及其应用 bitset 是一个 bool 数组,但每个布尔值不是单独存储的,而是 bitset 优化了空间,使得每个 bool 只占用 1 位空间,因此 bitset bs 占用了空间小于 bool bs[N] 和向量 bs(N)。然而,bitset 的一个限制是,N 必须在编译时已知,即一个常量(向量和动态数组没有这个限制)
As bitset stores the same information in compressed manner the operation on bitset are faster than that of array and vector. We can access each bit of bitset individually with help of array indexing operator [] that is bs[3] shows bit at index 3 of bitset bs just like a simple array. Remember bitset starts its indexing backward that is for 10110, 0 are at 0th and 3rd indices whereas 1 are at 1st 2nd and 4th indices. We can construct a bitset using integer number as well as binary string via constructors which is shown in below code. The size of bitset is fixed at compile time that is, it can't be changed at runtime. For more information about bitset visit the site : https://www.geeksforgeeks.org/c-bitset-and-its-application
由于 bitset 以压缩方式存储相同的信息,因此对 bitset 的操作比对数组和向量的操作要快。我们可以在数组索引运算符 [] 的帮助下单独访问 bitset 的每一位,即 bs[3] 像一个简单的数组一样显示 bitset bs 的索引 3 处的位。请记住,bitset 开始向后索引 10110,0 位于第 0 和第 3 个索引,而 1 位于第 2 个和第 4 个索引。我们可以通过构造函数使用整数和二进制字符串构造一个位集,如下面的代码所示。bitset 的大小在编译时是固定的,即不能在运行时更改。有关 bitset 的更多信息,请访问该站点:https: //www.geeksforgeeks.org/c-bitset-and-its-application
The code is as follows :
代码如下:
for (std::vector<char>::iterator i = s1.begin(); i != s1.end(); ++i)
{
if (std::find(s2.begin(), s2.end(), *i) != s2.end())
{
dest.push_back(*i);
}
}
回答by Betran Jacob
C implementation to run in O(n) time and constant space.
在 O(n) 时间和恒定空间中运行的 C 实现。
##代码##}
}