How to convert PascalCase to pascal_case?
Original source: http://stackoverflow.com/questions/1993721/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors on Stack Overflow.
Asked by openfrog
If I had:
$string = "PascalCase";
I need:
"pascal_case"
Does PHP offer a function for this purpose?
Answer by cletus
Try this on for size:
$tests = array(
    'simpleTest' => 'simple_test',
    'easy' => 'easy',
    'HTML' => 'html',
    'simpleXML' => 'simple_xml',
    'PDFLoad' => 'pdf_load',
    'startMIDDLELast' => 'start_middle_last',
    'AString' => 'a_string',
    'Some4Numbers234' => 'some4_numbers234',
    'TEST123String' => 'test123_string',
);
foreach ($tests as $test => $result) {
    $output = from_camel_case($test);
    if ($output === $result) {
        echo "Pass: $test => $result\n";
    } else {
        echo "Fail: $test => $result [$output]\n";
    }
}
function from_camel_case($input) {
    // Split the input into words: uppercase runs (acronyms) or ordinary words.
    preg_match_all('!([A-Z][A-Z0-9]*(?=$|[A-Z][a-z0-9])|[A-Za-z][a-z0-9]+)!', $input, $matches);
    $ret = $matches[0];
    foreach ($ret as &$match) {
        // All-uppercase words are lowercased entirely; others only get a lowercase first letter.
        $match = $match == strtoupper($match) ? strtolower($match) : lcfirst($match);
    }
    unset($match); // break the reference left over from the foreach
    return implode('_', $ret);
}
Output:
Pass: simpleTest => simple_test
Pass: easy => easy
Pass: HTML => html
Pass: simpleXML => simple_xml
Pass: PDFLoad => pdf_load
Pass: startMIDDLELast => start_middle_last
Pass: AString => a_string
Pass: Some4Numbers234 => some4_numbers234
Pass: TEST123String => test123_string
This implements the following rules:
- A sequence beginning with a lowercase letter must be followed by lowercase letters and digits;
- A sequence beginning with an uppercase letter can be followed by either:
  - one or more uppercase letters and digits (followed by either the end of the string or an uppercase letter followed by a lowercase letter or digit, i.e. the start of the next sequence); or
  - one or more lowercase letters or digits.
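To see how these rules split an identifier into words, the following minimal sketch prints the tokens that the same pattern extracts before they are lowercased and joined. The helper name camel_tokens is hypothetical and only used for illustration.

```php
<?php
// Hypothetical helper: return the words matched by the pattern used in from_camel_case().
function camel_tokens(string $input): array
{
    $pattern = '!([A-Z][A-Z0-9]*(?=$|[A-Z][a-z0-9])|[A-Za-z][a-z0-9]+)!';
    preg_match_all($pattern, $input, $matches);
    return $matches[0]; // full matches, in order of appearance
}

// The uppercase run stops one letter early so that "Last" keeps its capital.
print_r(camel_tokens('startMIDDLELast')); // start, MIDDLE, Last
// Digits stay attached to the word they follow.
print_r(camel_tokens('TEST123String'));   // TEST123, String
```

Each token is then either fully lowercased (if it is all uppercase) or passed through lcfirst, and the results are joined with underscores.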
Answer by Jan Jakeš
A shorter solution: similar to the editor's one, with a simplified regular expression and fixing the "trailing-underscore" problem:
$output = strtolower(preg_replace('/(?<!^)[A-Z]/', '_$0', $input));
Note that cases like SimpleXML will be converted to simple_x_m_l using the above solution. That can also be considered a wrong usage of camel case notation (correct would be SimpleXml) rather than a bug in the algorithm, since such cases are always ambiguous. Even when uppercase characters are grouped into one string (simple_xml), such an algorithm will always fail in other edge cases like XMLHTMLConverter or one-letter words near abbreviations. If you don't mind the (rather rare) edge cases and want to handle SimpleXML correctly, you can use a slightly more complex solution:
$output = ltrim(strtolower(preg_replace('/[A-Z]([A-Z](?![a-z]))*/', '_$0', $input)), '_');
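The trade-off between the per-letter approach and the grouped-uppercase approach can be illustrated with a small side-by-side sketch. The function names snake_simple and snake_grouped are hypothetical wrappers around the two regular expressions, not part of any library.

```php
<?php
// Variant 1 (hypothetical wrapper): underscore before every uppercase letter except the first character.
function snake_simple(string $input): string
{
    return strtolower(preg_replace('/(?<!^)[A-Z]/', '_$0', $input));
}

// Variant 2 (hypothetical wrapper): a run of uppercase letters counts as one word;
// the leading underscore added before the first word is trimmed afterwards.
function snake_grouped(string $input): string
{
    return ltrim(strtolower(preg_replace('/[A-Z]([A-Z](?![a-z]))*/', '_$0', $input)), '_');
}

foreach (['PascalCase', 'SimpleXML', 'XMLHTMLConverter'] as $name) {
    printf("%-18s %-26s %s\n", $name, snake_simple($name), snake_grouped($name));
}
```

For XMLHTMLConverter the grouped variant yields xmlhtml_converter, which shows the ambiguity: no regular expression can know that XML and HTML are separate words.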
Answer by Syone
A concise solution that can handle some tricky use cases:
function decamelize($string) {
    return strtolower(preg_replace(['/([a-z\d])([A-Z])/', '/([^_])([A-Z][a-z])/'], '$1_$2', $string));
}
It can handle all these cases:
simpleTest => simple_test
easy => easy
HTML => html
simpleXML => simple_xml
PDFLoad => pdf_load
startMIDDLELast => start_middle_last
AString => a_string
Some4Numbers234 => some4_numbers234
TEST123String => test123_string
hello_world => hello_world
hello__world => hello__world
_hello_world_ => _hello_world_
hello_World => hello_world
HelloWorld => hello_world
helloWorldFoo => hello_world_foo
hello-world => hello-world
myHTMLFiLe => my_html_fi_le
aBaBaB => a_ba_ba_b
BaBaBa => ba_ba_ba
libC => lib_c
You can test this function here: http://syframework.alwaysdata.net/decamelize
Answer by user644783
Ported from Ruby's String#camelize and String#decamelize.
// Note: the /e modifier was deprecated in PHP 5.5 and removed in PHP 7.0.
function decamelize($word) {
    return preg_replace(
        '/(^|[a-z])([A-Z])/e',
        'strtolower(strlen("\1") ? "\1_\2" : "\2")',
        $word
    );
}
function camelize($word) {
    return preg_replace('/(^|_)([a-z])/e', 'strtoupper("\2")', $word);
}
One trick the above solutions may have missed is the 'e' modifier, which causes preg_replace to evaluate the replacement string as PHP code.
Answer by matthew
The Symfony Serializer Component has a CamelCaseToSnakeCaseNameConverter that has two methods, normalize() and denormalize(). These can be used as follows:
use Symfony\Component\Serializer\NameConverter\CamelCaseToSnakeCaseNameConverter;

$nameConverter = new CamelCaseToSnakeCaseNameConverter();
echo $nameConverter->normalize('camelCase');
// outputs: camel_case
echo $nameConverter->denormalize('snake_case');
// outputs: snakeCase
Answer by buley
Most solutions here feel heavy-handed. Here's what I use:
$underscored = strtolower(
    preg_replace(
        ['/([A-Z]+)/', '/_([A-Z]+)([A-Z][a-z])/'],
        ['_$1', '_$1_$2'],
        lcfirst($camelCase)
    )
);
"CamelCASE" is converted to "camel_case".
- lcfirst($camelCase) lowercases the first character (this keeps the output for 'CamelCASE' from starting with an underscore)
- [A-Z] finds capital letters
- + treats every run of consecutive uppercase letters as one word (this keeps 'CamelCASE' from being converted to camel_C_A_S_E)
- The second pattern and replacement are for ThoseSPECCases -> those_spec_cases instead of those_speccases
- strtolower([…]) turns the output to lowercase
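The steps listed above can be traced one at a time. This is a minimal sketch that applies the two patterns separately (preg_replace with pattern arrays applies them in sequence anyway) and returns every intermediate string; the function name underscore_steps is hypothetical.

```php
<?php
// Hypothetical helper: return the string after each stage of the conversion.
function underscore_steps(string $camelCase): array
{
    $step1 = lcfirst($camelCase);                                        // lowercase the first character
    $step2 = preg_replace('/([A-Z]+)/', '_$1', $step1);                  // underscore before each uppercase run
    $step3 = preg_replace('/_([A-Z]+)([A-Z][a-z])/', '_$1_$2', $step2);  // split an acronym from the next word
    return [$step1, $step2, $step3, strtolower($step3)];
}

print_r(underscore_steps('ThoseSPECCases'));
// thoseSPECCases, those_SPECCases, those_SPEC_Cases, those_spec_cases
```

Without the second pattern, the run SPECC would stay together and the result would be those_speccases.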
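Answer by ekhaled
PHP does not offer a built-in function for this as far as I know, but here is what I use:
function uncamelize($camel, $splitter = "_") {
    $camel = preg_replace('/(?!^)[[:upper:]][[:lower:]]/', '$0', preg_replace('/(?!^)[[:upper:]]+/', $splitter . '$0', $camel));
    return strtolower($camel);
}
The splitter can be specified in the function call, so you can call it like so:
$camelized = "thisStringIsCamelized";
echo uncamelize($camelized, "_");
// echoes "this_string_is_camelized"
echo uncamelize($camelized, "-");
// echoes "this-string-is-camelized"
Answer by inf3rno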
You need to run a regex through it that matches every uppercase letter except when it is at the beginning, and replace it with an underscore plus that letter. A UTF-8 solution is this:
header('content-type: text/html; charset=utf-8');
$separated = preg_replace('%(?<!^)\p{Lu}%usD', '_$0', 'AaaaBbbbCcccDdddÁááá????');
$lower = mb_strtolower($separated, 'utf-8');
echo $lower; //aaaa_bbbb_cccc_dddd_áááá_????
If you are not sure what case your string is in, it is better to check it first, because this code assumes that the input is camelCase instead of underscore_Case or dash-Case, so if the latter have uppercase letters, it will add underscores to them.
The accepted answer from cletus is way too overcomplicated IMHO, and it works only with Latin characters. I find it a really bad solution and wonder why it was accepted at all. Converting TEST123String into test123_string is not necessarily a valid requirement. I rather kept it simple and separated ABCccc into a_b_cccc instead of ab_cccc, because it does not lose information this way and the backward conversion will give the exact same string we started with. Even if you want to do it the other way, it is relatively easy to write a regex for it with lookbehind, (?<!^)\p{Lu}\p{Ll}|(?<=\p{Ll})\p{Lu}, or two regexes without lookbehind if you are not a regex expert. There is no need to split it up into substrings, not to mention deciding between strtolower and lcfirst where using just strtolower would be completely fine.
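The information-loss argument above can be checked with a short sketch that runs both regular expressions on ABCccc and then converts the results back. All function names here (lower_utf8, snake_every_upper, snake_grouped_upper, camel_back) are hypothetical, and the mbstring fallback is an assumption about the environment rather than part of the original answer.

```php
<?php
// Hypothetical helper: use mbstring when available; strtolower is only safe for ASCII input.
function lower_utf8(string $s): string
{
    return function_exists('mb_strtolower') ? mb_strtolower($s, 'UTF-8') : strtolower($s);
}

// Every uppercase letter except the first character starts a new word.
function snake_every_upper(string $s): string
{
    return lower_utf8(preg_replace('/(?<!^)\p{Lu}/u', '_$0', $s));
}

// Alternative regex from the text: runs of uppercase letters stay together.
function snake_grouped_upper(string $s): string
{
    return lower_utf8(preg_replace('/(?<!^)\p{Lu}\p{Ll}|(?<=\p{Ll})\p{Lu}/u', '_$0', $s));
}

// Naive reverse conversion (ASCII only): capitalize each segment and join.
function camel_back(string $s): string
{
    return implode('', array_map('ucfirst', explode('_', $s)));
}

echo snake_every_upper('ABCccc'), ' -> ', camel_back(snake_every_upper('ABCccc')), "\n";     // a_b_cccc -> ABCccc
echo snake_grouped_upper('ABCccc'), ' -> ', camel_back(snake_grouped_upper('ABCccc')), "\n"; // ab_cccc -> AbCccc
```

Only the per-letter form round-trips to the original string; the grouped form comes back as AbCccc.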
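Answer by shacharsol
If you are looking for a version for PHP 5.4 and later, here is the code:
function decamelize($word) {
    return preg_replace_callback(
        "/(^|[a-z])([A-Z])/",
        function ($m) { return strtolower(strlen($m[1]) ? "$m[1]_$m[2]" : "$m[2]"); },
        $word
    );
}
function camelize($word) {
    return preg_replace_callback(
        "/(^|_)([a-z])/",
        function ($m) { return strtoupper("$m[2]"); },
        $word
    );
}
Answer by Edakos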
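Not fancy at all, but simple and speedy as hell:
function uncamelize($str)
{
    $str = lcfirst($str);
    $lc = strtolower($str);
    $result = '';
    $length = strlen($str);
    for ($i = 0; $i < $length; $i++) {
        // A character that differs from its lowercase form is uppercase: prefix it with an underscore.
        $result .= ($str[$i] == $lc[$i] ? '' : '_') . $lc[$i];
    }
    return $result;
}
echo uncamelize('HelloAWorld'); //hello_a_world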
