C: I have a program that capitalizes all words of a string. The program is working, but I want to understand what the code is doing

Posted by yduiuuwa on 2023-05-06 in Other


#include <stdio.h>
char *cap_string(char *str);

int main(void) {
    char str[] = "Expect the best. Prepare for the worst. Capitalize on what comes.\nhello world! hello-world 0123456hello world\thello world.hello world\n";
    char *ptr;
    ptr = cap_string(str);
    printf("%s", ptr);
    printf("%s", str);
    return (0);
}

char *cap_string(char *str)
{
    int index = 0;

    while (str[index])
    {
        while (!(str[index] >= 'a' && str[index] <= 'z'))
            index++;

        if (str[index - 1] == ' ' ||
            str[index - 1] == '\t' ||
            str[index - 1] == '\n' ||
            str[index - 1] == ',' ||
            str[index - 1] == ';' ||
            str[index - 1] == '.' ||
            str[index - 1] == '!' ||
            str[index - 1] == '?' ||
            str[index - 1] == '"' ||
            str[index - 1] == '(' ||
            str[index - 1] == ')' ||
            str[index - 1] == '{' ||
            str[index - 1] == '}' ||
            index == 0)
            str[index] -= 32;

        index++;
    }

    return (str);
}

I would like to know what this loop is doing; I just can't follow it:

while (!(str[index] >= 'a' && str[index] <= 'z'))
    index++;

Source: https://stackoverflow.com/questions/76180254/i-have-a-program-that-capitalizes-all-words-of-a-string-the-program-is-working

3 Answers

deyfvvtc1#

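The first thing to note is that this code relies entirely on ASCII (or ASCII-compatible) input, for two reasons:
1. It assumes that all upper-case letters, and likewise all lower-case letters, directly follow one another.
2. It assumes a distance of 32 between corresponding upper-case and lower-case letters.
A counter-example is the (in-?)famous EBCDIC encoding, admittedly not of much relevance any more...
ASCII has both of these characteristics. If you look closely at an ASCII table you will notice, for example, that A is encoded by the value 65 and a by 97, a difference of 32. So by subtracting 32 from the value of a you obtain the value of A, and likewise for all the other letters.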
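The offset can be made explicit in a small sketch. The helper name ascii_to_upper is hypothetical (it does not appear in the question), and the code assumes an ASCII-compatible character set, just like the original:

```c
/* Hypothetical helper for illustration only.
   Converts an ASCII lower-case letter to upper case by subtracting
   the distance between 'a' and 'A'; other characters are returned as-is.
   Assumes an ASCII-compatible execution character set. */
char ascii_to_upper(char c)
{
    if (c >= 'a' && c <= 'z')
        return (char)(c - ('a' - 'A')); /* 'a' - 'A' is 32 in ASCII */
    return c;
}
```

Writing the offset as 'a' - 'A' instead of the literal 32 documents where the number comes from, but it still depends on the letters being contiguous.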
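The specific loop checks whether a character lies outside the range [97; 122] (in mathematical notation: *not* 97 ≤ letter ≤ 122) and, if so, just increments the index, i.e. it skips every character that is not a lower-case letter.
Note, though, that this program exhibits undefined behaviour if the first character is a lower-case letter!
It does check whether the index is 0, but too late! By the time that test is reached, str[-1] has *already* been accessed, so the array is accessed out of bounds, which is UB. You need to test *first* whether index is 0, and only *then* can you check whether the preceding character matches one of the separators.
In addition, trouble arises at the very end of the string if it does not end in a lower-case letter: the inner while loop then keeps iterating beyond the string until it finds a value that "accidentally" falls into that range, and modifies it, in some entirely different place, possibly doing something harmful!
A *safe* variant needs only a small modification: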

while(str[index])
{
    if(str[index] >= 'a' && str[index] <= 'z')
    {
       if(index == 0 || /* all the other tests */)
       {
           str[index] -= 32;
       }
    }
    ++index;
}
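For a version that compiles as written, the elided tests can be filled in from the question's if statement. This is a minimal sketch, not the answer author's code: the names is_separator and cap_string_safe are hypothetical, and the ASCII assumption (subtracting 32) is kept on purpose.

```c
#include <stddef.h>

/* Hypothetical helper: returns 1 if c is one of the separators
   tested in the question's if statement, 0 otherwise. */
static int is_separator(char c)
{
    return c == ' ' || c == '\t' || c == '\n' || c == ',' || c == ';' ||
           c == '.' || c == '!'  || c == '?'  || c == '"' || c == '(' ||
           c == ')' || c == '{'  || c == '}';
}

/* Capitalizes every lower-case letter that starts the string or
   follows a separator. Still assumes ASCII because of the "-= 32". */
char *cap_string_safe(char *str)
{
    size_t index = 0;

    while (str[index])   /* the only loop, so the terminator is never passed */
    {
        if (str[index] >= 'a' && str[index] <= 'z')
        {
            /* index == 0 comes first: thanks to short-circuit ||,
               str[index - 1] is never evaluated for the first character. */
            if (index == 0 || is_separator(str[index - 1]))
                str[index] -= 32;
        }
        ++index;
    }
    return str;
}
```

Called on a writable array such as char s[] = "hello world.hello-world";, the function changes the array in place and returns s.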

Personally, I'd prefer a for loop, though:

for(size_t index = 0; str[index] != 0; ++index)
{
    if(...) {...}
}

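A more generic solution (one that does not rely on ASCII) uses the islower and toupper functions. To detect whether a change to upper case is needed, you might use e.g. the isspace and ispunct functions, or !isalnum. The code could then look like this (pointer arithmetic is used here for further convenience):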

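/* requires #include <ctype.h>; ptr points to the start of the string */
for(char* p = ptr; *p; ++p)
{
    if(islower((unsigned char)*p))
    {
        /* capitalize at the start or after any non-alphanumeric character */
        if(p == ptr || !isalnum((unsigned char)p[-1]))
        {
            *p = (char)toupper((unsigned char)*p);
        }
    }
}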

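Note that the cast to unsigned char is necessary: if char is actually signed, it prevents characters in the extended ASCII range (> 127) from being interpreted as negative values.
Note, too, that the code above now also capitalizes after - and _, which the original did not do; you might want to exclude these explicitly if need be.
If you want to retain an explicit list of separators, you can still make the test simpler: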

if(p == ptr || strchr(" \t\n.,;[...]", p[-1])) { ... }

(Since this only tests for equality, and the list contains no negative values, you do not need the cast to unsigned char here.)
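Put together, a version with an explicit separator list might look like the sketch below. The names separators and cap_string_strchr are assumptions made for illustration, and the list itself is copied from the question's if statement rather than from this answer:

```c
#include <ctype.h>
#include <string.h>

/* Separator list copied from the question's if statement. */
static const char separators[] = " \t\n,;.!?\"(){}";

/* Hypothetical name. Capitalizes each lower-case letter that starts
   the string or follows one of the listed separators. */
char *cap_string_strchr(char *str)
{
    for (char *p = str; *p; ++p)
    {
        /* p == str is tested first, so p[-1] is only read when it exists.
           p[-1] is never '\0' here, so strchr cannot match the terminator. */
        if (islower((unsigned char)*p) &&
            (p == str || strchr(separators, p[-1]) != NULL))
        {
            *p = (char)toupper((unsigned char)*p);
        }
    }
    return str;
}
```

Changing which characters count as word boundaries then only means editing the string literal.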

2023-05-06

nzk0hqpo2#

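The code defines a function cap_string that capitalizes the first letter of each word in the given string, where a word is a sequence of characters delimited by a space, tab, newline, comma, semicolon, period, exclamation mark, question mark, double quote, parenthesis, or curly brace. The main function defines a string str, passes it to cap_string, and then prints both the returned pointer and str. Because cap_string modifies the string in place and returns the same pointer, both printf calls print the same capitalized text.
The outer loop looks through the string for these separators, treats them as word boundaries, and capitalizes the first letter that follows each one.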

2023-05-06

41zrol4v3#

For starters, the function is actually incorrect: besides its logical errors, it can invoke undefined behaviour.
In this while loop

while (!(str[index] >= 'a' && str[index] <= 'z'))
    index++;

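there is no check for whether the end of the string (i.e. the terminating zero character '\0') has been reached. So this while loop can read memory beyond the string.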
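As a minimal sketch of the missing check, the skipping step can be written as a small function that stops at the terminator. The name skip_to_lowercase is hypothetical and is not part of the question or of this answer:

```c
#include <stddef.h>

/* Hypothetical helper: starting at index, returns the index of the next
   lower-case ASCII letter, or the index of the terminating '\0' if
   there is none. The str[index] test keeps the loop inside the string. */
size_t skip_to_lowercase(const char *str, size_t index)
{
    while (str[index] && !(str[index] >= 'a' && str[index] <= 'z'))
        index++;
    return index;
}
```

A caller would then check whether str[returned_index] is '\0' before treating the position as a letter.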
Another problem is in the if statement

if (str[index - 1] == ' ' ||
        str[index - 1] == '\t' ||
        str[index - 1] == '\n' ||
        str[index - 1] == ',' ||
        str[index - 1] == ';' ||
        str[index - 1] == '.' ||
        str[index - 1] == '!' ||
        str[index - 1] == '?' ||
        str[index - 1] == '"' ||
        str[index - 1] == '(' ||
        str[index - 1] == ')' ||
        str[index - 1] == '{' ||
        str[index - 1] == '}' ||
        index == 0)

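When index is equal to 0, all the expressions that precede index == 0 access memory outside the string, again, because the expression index - 1 is negative. So at the very least the condition index == 0 should be the first condition in the if statement.
Yet another problem is that once a lower-case letter has been found, and possibly changed to upper case, you need to skip all subsequent letters until a non-letter is encountered.
And this statement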

str[index] -= 32;

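will, for example, produce an incorrect result if the EBCDIC character table is used instead of ASCII. It is better to use the standard C functions declared in the header <ctype.h> than to manipulate the characters by hand.
As for your question, this while loop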

while (!(str[index] >= 'a' && str[index] <= 'z'))
    index++;

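skips all characters in the string that are not lower-case letters, i.e. letters in the range ['a', 'z'].
In fact, a lower-case letter needs to be converted to upper case when it is the first character of the string, or when the preceding character is not an upper-case letter (that is, it is a non-alphabetic character). With that in mind, the function can look, for example, as shown in the demonstration program below.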

#include <stdio.h>
#include <ctype.h>

char * cap_string( char *str )
{

    char *p = str;

    do
    {
        while (*p && !islower( ( unsigned char )*p )) ++p;

        if (*p && ( p == str || !isupper( ( unsigned char )p[-1] ) ) )
        {
            *p = toupper( ( unsigned char )*p );
        }

        while (isalpha( ( unsigned char )*p ) ) ++p;
    } while ( *p );

    return str;
}

int main( void )
{
    char str[] = "Expect the best. Prepare for the worst. "
                 "Capitalize on what comes.\n"
                 "hello world! hello-world 0123456hello world\t"
                 "hello world.hello world\n";

    puts( cap_string( str ) );
}

The program output is

Expect The Best. Prepare For The Worst. Capitalize On What Comes.
Hello World! Hello-World 0123456Hello World     Hello World.Hello World

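The function shown also converts a letter to upper case when it follows a digit. If you do not want letters after digits to be converted, change this if statement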

if (*p && ( p == str || !isupper( ( unsigned char )p[-1] ) ))

to this one

if (*p && ( p == str || ( !isupper( ( unsigned char )p[-1] ) && !isdigit( ( unsigned char )p[-1] ) ) ) )
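For reference, here is a sketch of the whole function with that change applied. The name cap_string_skip_digits is hypothetical; everything else follows the demonstration program, with the argument of toupper cast to unsigned char:

```c
#include <ctype.h>

/* Hypothetical name. Same structure as the demonstration function,
   except that a letter directly after a digit is left unchanged. */
char *cap_string_skip_digits(char *str)
{
    char *p = str;

    do
    {
        /* Skip everything that is not a lower-case letter,
           stopping at the terminating '\0'. */
        while (*p && !islower((unsigned char)*p)) ++p;

        if (*p && (p == str || (!isupper((unsigned char)p[-1]) &&
                                !isdigit((unsigned char)p[-1]))))
        {
            *p = (char)toupper((unsigned char)*p);
        }

        /* Skip the rest of the current word. */
        while (isalpha((unsigned char)*p)) ++p;
    } while (*p);

    return str;
}
```

With this variant the hello in 0123456hello stays lower-case, while a letter after - is still capitalized.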

2023-05-06
