Problem trimming a Japanese string in Java

Posted by gywdnpxw on 2023-04-04 in Java


I have the following Japanese string: "　ユーザー名". The first character looks like whitespace, but its Unicode code point is 12288 (U+3000), so "　ユーザー名".trim() returns the same string (trim has no effect). Trimming it in C++ works fine. Does anyone know how to solve this in Java? Is there a special trim method for Unicode?

Java

Source: https://stackoverflow.com/questions/479825/problem-trimming-japanese-string-in-java

6 answers

4bbkushb1#

As an alternative to the StringUtils class mentioned by Mike, you can also use a Unicode-aware regular expression, using only Java's own libraries:

"　ユーザー名".replaceAll("\\p{Z}", "")

Or, to actually trim only the ends rather than remove every space in the string:

"　ユーザ ー名 ".replaceAll("(^\\p{Z}+|\\p{Z}+$)", "")
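One detail worth knowing about the patterns above: \p{Z} matches the Unicode separator categories (which include U+3000) but not tab or newline, since those are control characters. The following is a minimal sketch, assuming you want ASCII whitespace trimmed as well; the class name RegexTrim and the method name trim are hypothetical and not part of the original answer.

```java
import java.util.regex.Pattern;

final class RegexTrim {
    // \p{Z} covers the Unicode separator categories (Zs, Zl, Zp), including U+3000.
    // \s adds the ASCII whitespace characters such as tab and newline.
    // The pattern is compiled once so repeated calls do not recompile it.
    private static final Pattern EDGES =
            Pattern.compile("^[\\p{Z}\\s]+|[\\p{Z}\\s]+$");

    // Removes whitespace from both ends only; inner spaces are kept.
    static String trim(String s) {
        return EDGES.matcher(s).replaceAll("");
    }

    public static void main(String[] args) {
        // U+3000, then the Japanese text from the question, then U+3000 and a tab.
        String input = "\u3000\u30E6\u30FC\u30B6\u30FC\u540D\u3000\t";
        System.out.println(trim(input).length());
    }
}
```

The string literals use Unicode escapes only so that the example does not depend on the source file encoding.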

Posted 2023-04-04

mepcadol2#

Take a look at Unicode normalization and the Normalizer class. The class is new in Java 6, but if you are on an earlier JRE you can find an equivalent in the ICU4J library.

import java.text.Normalizer;

int character = 12288; // U+3000 IDEOGRAPHIC SPACE
char[] ch = Character.toChars(character);
String input = new String(ch);
// NFKC maps U+3000 to an ordinary space (U+0020), which trim() can remove.
String normalized = Normalizer.normalize(input, Normalizer.Form.NFKC);

System.out.println("Hex value:\t" + Integer.toHexString(character));
System.out.println("Trimmed length           :\t" + input.trim().length());
System.out.println("Normalized trimmed length:\t" + normalized.trim().length());
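Note that NFKC normalization rewrites more than whitespace: other compatibility characters, such as full-width Latin letters and half-width katakana, are also replaced. The sketch below illustrates this so you can judge whether the side effect is acceptable for your data; the class name NfkcSideEffects and the helper nfkc are hypothetical names introduced for illustration.

```java
import java.text.Normalizer;

final class NfkcSideEffects {
    // Thin wrapper around the standard NFKC normalization call.
    static String nfkc(String s) {
        return Normalizer.normalize(s, Normalizer.Form.NFKC);
    }

    public static void main(String[] args) {
        // U+3000 IDEOGRAPHIC SPACE becomes U+0020, which trim() can then remove.
        System.out.println(nfkc("\u3000").equals(" "));
        // U+FF21 FULLWIDTH LATIN CAPITAL LETTER A becomes ASCII "A".
        System.out.println(nfkc("\uFF21").equals("A"));
        // U+FF76 HALFWIDTH KATAKANA LETTER KA becomes U+30AB KATAKANA LETTER KA.
        System.out.println(nfkc("\uFF76").equals("\u30AB"));
    }
}
```

If the text must be preserved exactly apart from its leading and trailing whitespace, a trim that only inspects the ends of the string is the safer choice.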

Posted 2023-04-04

cu6pst1q3#

Try the StringUtils class from Apache Commons. Its StringUtils.strip() method should work for you.

Posted 2023-04-04

waxmsbnn4#

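The Java documentation explains why this doesn't work:
"If this String object represents an empty character sequence, or the first and last characters of character sequence represented by this String object both have codes greater than '\u0020' (the space character), then a reference to this String object is returned."
You can easily roll your own version. The codePointAt method may be useful for this.
http://java.sun.com/j2se/1.5.0/docs/api/java/lang/String.html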
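A hand-rolled version along these lines might look like the sketch below. It assumes that Character.isWhitespace is the definition of whitespace you want (it accepts U+3000 but rejects the non-breaking space U+00A0); the class name CodePointTrim and the method name trim are hypothetical.

```java
final class CodePointTrim {
    // Trims leading and trailing whitespace by code point rather than by char,
    // so the scan would stay correct even for characters outside the BMP.
    static String trim(String s) {
        int start = 0;
        int end = s.length();

        // Advance past leading whitespace, one code point at a time.
        while (start < end) {
            int cp = s.codePointAt(start);
            if (!Character.isWhitespace(cp)) {
                break;
            }
            start += Character.charCount(cp);
        }

        // Step back over trailing whitespace, one code point at a time.
        while (end > start) {
            int cp = s.codePointBefore(end);
            if (!Character.isWhitespace(cp)) {
                break;
            }
            end -= Character.charCount(cp);
        }

        return s.substring(start, end);
    }

    public static void main(String[] args) {
        // U+3000 followed by the Japanese text from the question.
        String input = "\u3000\u30E6\u30FC\u30B6\u30FC\u540D";
        System.out.println(input.length() + " -> " + trim(input).length());
    }
}
```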

Posted 2023-04-04

jc3wubiy5#

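You will have to write your own trim() method based on Character.isWhitespace(). Unfortunately, trim() does not do what its API documentation claims: it strips only ASCII whitespace (more precisely, every character up to and including '\u0020'), not any other kind of whitespace.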
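On newer runtimes there is also a built-in option: Java 11 added String.strip(), which is defined in terms of Character.isWhitespace(). The sketch below assumes Java 11 or later and compares the two methods character by character; the class name StripVsTrim and both helper methods are hypothetical names used only for illustration.

```java
final class StripVsTrim {
    // True if trim() removes the character (code point <= U+0020).
    static boolean removedByTrim(char c) {
        return String.valueOf(c).trim().isEmpty();
    }

    // True if strip() removes the character (Character.isWhitespace is true).
    // Requires Java 11 or later.
    static boolean removedByStrip(char c) {
        return String.valueOf(c).strip().isEmpty();
    }

    public static void main(String[] args) {
        // ASCII space, tab, ideographic space, non-breaking space.
        char[] samples = {' ', '\t', '\u3000', '\u00A0'};
        for (char c : samples) {
            System.out.println("U+" + Integer.toHexString(c).toUpperCase()
                    + " trim=" + removedByTrim(c)
                    + " strip=" + removedByStrip(c));
        }
    }
}
```

Neither method removes the non-breaking space U+00A0, because Character.isWhitespace() deliberately excludes non-breaking spaces.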

Posted 2023-04-04

mrphzbgm6#

I think this is a simple way to trim a Japanese string in Java:

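// Returns the length s would have after removing leading and trailing
// whitespace, as defined by Character.isWhitespace (which includes U+3000).
public static int getTrimmedLength(CharSequence s) {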
    int len = s.length();

    int start = 0;
    while (start < len && Character.isWhitespace(s.charAt(start))) {
        start++;
    }

    int end = len;
    while (end > start && Character.isWhitespace(s.charAt(end - 1))) {
        end--;
    }

    return end - start;
}

// Returns s without leading and trailing whitespace, as defined by
// Character.isWhitespace. Scans with two indices and copies once at the end.
public static String trimWhitespace(CharSequence s) {
    int len = s.length();

    int start = 0;
    while (start < len && Character.isWhitespace(s.charAt(start))) {
        start++;
    }

    int end = len;
    while (end > start && Character.isWhitespace(s.charAt(end - 1))) {
        end--;
    }

    return s.subSequence(start, end).toString();
}

Posted 2023-04-04
