Wrapping words in a string with HTML tags produces incorrect output

Posted by sulc1iza on 2023-08-01 in Other

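I'm having a problem wrapping flagged words in HTML tags. The flagged-word response comes from a third-party service as an array of objects. The paragraph text looks like this: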

Hi, my nme is John, and I am from uas.\\nthis sentce dones mke sense.\\nHi, my nme is John, and I am from uas.

The flagged-word response looks like this:

[
  { offset: 7, token: 'nme', type: 'UnknownToken' },
  { offset: 52, token: 'dones', type: 'UnknownToken' },
  { offset: 58, token: 'mke', type: 'UnknownToken' }
]


I want to wrap each flagged token in the following HTML tag:

<span class="underline">nme</span>


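The response already has the offset and token keys, but my logic doesn't handle them correctly. I'm replacing the string at each offset index, but the output is wrong.

My logic: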

function replaceAt(str, index, replacement) {
  return (
    str.substring(0, index) +
    replacement +
    str.substring(index + replacement.length)
  );
}

let input = `Hi, my nme is John, and I am from uas.\\nthis sentce dones mke sense\\nHi, my nme is John, and I am from uas.`;
const flagTokens = [
  { offset: 7, token: "nme", type: "UnknownToken" },
  { offset: 52, token: "dones", type: "UnknownToken" },
  { offset: 58, token: "mke", type: "UnknownToken" },
];

flagTokens.forEach((item) => {
  input = replaceAt(
    input,
    item.offset,
    `<span class="underline">${item.token}</span>`
  );
});

console.log("Output:", input);

Output:

Hi, my <span class="underline">nme</span>his sentce <span <span class="underline">mke</span>


How can I fix this?

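Answer #1 (x7rlezfr)

Your code has two problems.
The first is that after the first token is replaced, the string is longer, so the offsets of the remaining tokens are wrong: the text they point to has shifted forward. One solution is to apply the replacements from last to first (this works because the tokens are sorted by offset).
The second problem is that your replaceAt function, after inserting the replacement text, continues from str.substring(index + replacement.length). That is wrong. You only need to skip the length of the token, not the length of the replacement, so you should pass the token length to the function as well.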

function replaceAt(str, index, replacement, length) {
  return (
    str.substring(0, index) +
    replacement +
    str.substring(index + length)
  );
}

let input = `Hi, my nme is John, and I am from uas.\\nthis sentce dones mke sense.`;
const flagTokens = [
  { offset: 7, token: "nme", type: "UnknownToken" },
  { offset: 52, token: "dones", type: "UnknownToken" },
  { offset: 58, token: "mke", type: "UnknownToken" },
];

// using .reverse() here to apply the replacements
// in reverse order (from last to first)
flagTokens.reverse().forEach((item) => {
  input = replaceAt(
    input,
    item.offset,
    `<span class="underline">${item.token}</span>`, 
    item.token.length
  );
});

console.log("Output:", input);
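To make the two problems concrete, here is a small sketch that measures how far a single replacement moves the later tokens. The variable names (growth, afterFirst, swallowed) are made up for illustration, and the offset 7 is taken from the first entry in flagTokens.

```javascript
const input = "Hi, my nme is John, and I am from uas.\\nthis sentce dones mke sense.";
const token = "nme";
const replacement = `<span class="underline">${token}</span>`;

// Problem 1: every replacement makes the string longer by this amount.
const growth = replacement.length - token.length;
console.log(growth); // 31

// A correct single replacement skips token.length characters, not replacement.length.
const afterFirst =
  input.substring(0, 7) + replacement + input.substring(7 + token.length);

// "dones" sits at offset 52 in the original, but has moved after the first replacement.
console.log(input.indexOf("dones"));      // 52
console.log(afterFirst.indexOf("dones")); // 83

// Problem 2: skipping replacement.length instead swallows 31 extra characters of text.
const swallowed = input.substring(7 + token.length, 7 + replacement.length);
console.log(swallowed.length); // 31
```

Going from the last token to the first avoids the shift entirely, because a replacement never changes the position of anything in front of it.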

Answer #2 (0ejtzxu1)

Use replace() instead

const flags = [
  { offset: 7, token: 'nme', type: 'UnknownToken' },
  { offset: 52, token: 'dones', type: 'UnknownToken' },
  { offset: 58, token: 'mke', type: 'UnknownToken' }
];

// Reference the element that has the text
const p = document.querySelector("p");
// Get the element's HTML content as a htmlString;
let string = p.innerHTML;

/**
 * For each object (flag) of array flags...
 *
 * Create a regex that is a capture group (...) of the
 * current object's token value and set the (g)lobal flag
 *
 * Replace any occurrence that matches the token with itself
 * ($1) wrapped in <u>...</u>.
 */
flags.forEach(flag => {
  let rgx = new RegExp(`(${flag.token})`, "g");
  string = string.replace(rgx, "<u>$1</u>");
});

// Parse the HTML of <p> with the modified string
p.innerHTML = string;
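Two caveats seem worth noting for the regex approach: a token containing regex metacharacters (such as . or +) would be interpreted as a pattern, and a short token can also match inside longer words. Below is a minimal sketch of one way to guard against both. The helpers escapeRegExp and underlineAll are hypothetical names, the function works on plain strings so it can run without a DOM, and the \b word boundaries assume each token starts and ends with a word character. It still wraps every occurrence rather than only the one at the flagged offset.

```javascript
// Hypothetical helper: escape characters that have a special meaning in a regex.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical helper: wrap every whole-word occurrence of any token in <u>...</u>.
function underlineAll(text, tokens) {
  if (tokens.length === 0) return text;
  // One alternation regex means a single pass over the text, so markup
  // inserted for one token is never re-scanned while handling another.
  const pattern = tokens.map(escapeRegExp).join("|");
  const rgx = new RegExp(`\\b(${pattern})\\b`, "g");
  return text.replace(rgx, "<u>$1</u>");
}

console.log(underlineAll("my nme is nmes", ["nme"])); // my <u>nme</u> is nmes
console.log(underlineAll("a.b axb", ["a.b"]));        // <u>a.b</u> axb
```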

Answer #3 (enxuqcxy)

An alternative solution. The bug itself was nicely pointed out by Gabriel.

let input = `Hi, my nme is John, and I am from uas.\\nthis sentce dones mke sense\\nHi, my nme is John, and I am from uas.`;
const flagTokens = [
  { offset: 7, token: "nme", type: "UnknownToken" },
  { offset: 52, token: "dones", type: "UnknownToken" },
  { offset: 58, token: "mke", type: "UnknownToken" },
];

let output = input + "";

flagTokens.forEach(t => output = output.replace(t.token, `<span class="underline">${t.token}</span>`));

console.log("Output:", output);

document.getElementById('input').innerHTML = input;
document.getElementById('result').innerHTML = output;

.underline {
  text-decoration: underline;
}

<div>Input: <p id="input"></p></div>
<div>Result: <p id="result"></p></div>
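One thing to keep in mind with this approach: when replace() is given a string pattern, it replaces only the first match and ignores the offsets entirely. The sketch below uses a made-up sentence and assumes the service flagged only the second "nme"; the names flagged, viaReplace and viaOffset are illustrative.

```javascript
const text = "my nme is nme";
// Assumption: the service flagged only the second "nme" (offset 10).
const flagged = { offset: 10, token: "nme" };
const wrapped = `<u>${flagged.token}</u>`;

// replace() with a string pattern wraps the first match (index 3), ignoring the offset.
const viaReplace = text.replace(flagged.token, wrapped);
console.log(viaReplace); // my <u>nme</u> is nme

// Slicing at the offset wraps the occurrence that was actually flagged.
const viaOffset =
  text.slice(0, flagged.offset) +
  wrapped +
  text.slice(flagged.offset + flagged.token.length);
console.log(viaOffset); // my nme is <u>nme</u>
```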
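Answer #4 (dxxyhpgq)

I think .replace() is a nice, elegant solution for the general case where you want to underline all occurrences, or just the first occurrence, of a given token. It struggles, though, if you want to underline tokens at specific indexes, as in your flagTokens array: when a token appears more than once, you may want one specific occurrence rather than all of them or just the first.
One option is to loop over the string. If the current index does not appear as an offset in the flagTokens array, append the current character to the result string (res). If the index does appear as an offset, append the token wrapped in <span> to res, add the token's length to the current index (skipping the rest of the token), and carry on with the remaining indexes. Because the original input string is left unchanged, you don't have to adjust the offsets for the characters added by <span>...</span>, since those go into res instead. For example: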

let input = `Hi, my nme is John, and I am from uas.\\nthis sentce dones mke sense`;
const flagTokens = [
  { offset: 7, token: "nme", type: "UnknownToken" },
  { offset: 52, token: "dones", type: "UnknownToken" },
  { offset: 58, token: "mke", type: "UnknownToken" },
];

const flagMap = new Map(flagTokens.map(o => [o.offset, o]));

let res = "";
let i = 0;
while(i < input.length) {
  let item = flagMap.get(i);
  if(item) {
    res += `<span class="underline">${item.token}</span>`;
    i += item.token.length;
  } else {
    res += input[i];
    i++;
  }
}

console.log("Output:", res);
document.body.innerHTML = res;
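A variation on the same idea, sketched here under a few assumptions, copies whole slices between tokens instead of single characters and checks that each token really sits at its reported offset. The function name wrapTokens and its wrap callback are hypothetical; entries that overlap an earlier token or do not match the text at their offset are skipped rather than reported.

```javascript
// Hypothetical helper: build the result from slices of the untouched input.
function wrapTokens(text, tokens, wrap) {
  // Sort a copy by ascending offset so the caller's array is not mutated.
  const sorted = [...tokens].sort((a, b) => a.offset - b.offset);
  const parts = [];
  let cursor = 0; // index in `text` up to which everything has been copied

  for (const { offset, token } of sorted) {
    // Skip entries that overlap an earlier token or do not match the text.
    if (offset < cursor || !text.startsWith(token, offset)) continue;
    parts.push(text.slice(cursor, offset), wrap(token));
    cursor = offset + token.length;
  }

  parts.push(text.slice(cursor)); // remainder after the last token
  return parts.join("");
}

const input = "Hi, my nme is John, and I am from uas.\\nthis sentce dones mke sense.";
const flagTokens = [
  { offset: 7, token: "nme", type: "UnknownToken" },
  { offset: 52, token: "dones", type: "UnknownToken" },
  { offset: 58, token: "mke", type: "UnknownToken" },
];

const result = wrapTokens(
  input,
  flagTokens,
  (token) => `<span class="underline">${token}</span>`
);
console.log(result);
```

Since every offset is read against the original text, the order in which the service returns the tokens does not matter here.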

Answer #5 (ybzsozfc)

The simplest solution is to start from the end, so you don't have to worry about the characters being added to the string.

const input = 'Hi, my nme is John, and I am from uas.\\nthis sentce dones mke sense.\\nHi, my nme is John, and I am from uas.'

const flagTokens = [
  { offset: 7, token: 'nme', type: 'UnknownToken' },
  { offset: 52, token: 'dones', type: 'UnknownToken' },
  { offset: 58, token: 'mke', type: 'UnknownToken' }
];

const replaceTokens = (input, tokens) => {
  // Work from the last token to the first so earlier offsets stay valid.
  const sorted = [...tokens].sort((a, b) => b.offset - a.offset);
  sorted.forEach(({ offset, token }) => {
    const replacement = `<span class="underline">${token}</span>`;
    input = input.substring(0, offset) + replacement + input.substring(offset + token.length);
  });
  return input;
};

const result = replaceTokens(input, flagTokens);
console.log(result);

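Another option is to keep track of how many characters you have added and add that to each offset (this relies on the tokens being processed in ascending offset order):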

const input = 'Hi, my nme is John, and I am from uas.\\nthis sentce dones mke sense.\\nHi, my nme is John, and I am from uas.'

const flagTokens = [
  { offset: 7, token: 'nme', type: 'UnknownToken' },
  { offset: 52, token: 'dones', type: 'UnknownToken' },
  { offset: 58, token: 'mke', type: 'UnknownToken' }
];

const replaceTokens = (input, tokens) => {
  // Running total of the extra characters inserted so far.
  let insertedOffset = 0;
  tokens.forEach(({ offset, token }) => {
    const replacement = `<span class="underline">${token}</span>`;
    input = input.substring(0, offset + insertedOffset) + replacement + input.substring(offset + insertedOffset + token.length);
    insertedOffset += replacement.length - token.length;
  });
  return input;
};

const result = replaceTokens(input, flagTokens);
console.log(result);

Answer #6 (r1zhe5dt)

Maybe this solution is what you are looking for

function replaceAt(str, index, replacement) {
  return (
    str.substring(0, index) +
    replacement +
    str.substring(index + replacement.length)
  );
}

let input = `Hi, my nme is John, and I am from uas.\\nthis sentce dones mke sense`;
const flagTokens = [
  { offset: 7, token: "nme", type: "UnknownToken" },
  { offset: 52, token: "dones", type: "UnknownToken" },
  { offset: 58, token: "mke", type: "UnknownToken" },
];
let myTokenIndex = 1;

flagTokens.forEach((item) => {
    
  input = replaceAt(
    input,
    item.offset + (31*myTokenIndex),
    `<span class="underline">${item.token}</span>`
  );
    myTokenIndex++;
});

console.log("Output:", input);

Output:

Output: Hi, my nme is John, and I am from uas.<span class="underline">nme</span><span class="underline">dones</span><span class="underline">mke</span>
