LeetCode-5382 HTML 实体解析器

12 Apr 2020 1223字 5分算法
CC BY 4.0（除特别声明或转载文章外）

📝题目

「HTML 实体解析器」 是一种特殊的解析器，它将 HTML 代码作为输入，并用字符本身替换掉所有这些特殊的字符实体。

HTML 里这些特殊字符和它们对应的字符实体包括：

双引号：字符实体为 &quot; ，对应的字符是 “ 。
单引号：字符实体为 &apos; ，对应的字符是 ' 。
与符号：字符实体为 &amp; ，对应对的字符是 & 。
大于号：字符实体为 &gt; ，对应的字符是 > 。
小于号：字符实体为 &lt; ，对应的字符是 < 。
斜线号：字符实体为 &frasl; ，对应的字符是 / 。
给你输入字符串 text ，请你实现一个 HTML 实体解析器，返回解析器解析后的结果。

示例 1：

输入：text = "&amp; is an HTML entity but &ambassador; is not."
输出："& is an HTML entity but &ambassador; is not."
解释：解析器把字符实体 &amp; 用 & 替换

示例 2：

输入：text = "and I quote: &quot;...&quot;"
输出："and I quote: \"...\""

示例 3：

输入：text = "Stay home! Practice on Leetcode :)"
输出："Stay home! Practice on Leetcode :)"

📝思路

巧妙地用上了上次提到的双指针操作。字符的解析对应用数组存储。

📝题解

string entityParser(string text) {
    string str1[6] = {"&quot;","&apos;","&amp;","&gt;","&lt;","&frasl;"};
    string str2[6] = {"\"","\'","&",">","<","/"};
    
    string result = "";
    int go = 0;
    int mark = 0;
    int len = text.size();
    
    for (int i = 0; i < len; ++i){
        if (text[mark] != '&'){
            result += text[mark];
            ++mark;
            ++go;
        }
        else {
            if (mark == go){
                ++go; 
                continue;
            }
            else if (go-mark >= 7){
                string tmp = text.substr(mark, go-mark+1);
                result += tmp;
                ++go;
                mark = go;
            }
            else {
                string tmp = text.substr(mark, go-mark+1);
                bool found = false;
                for (int j = 0; j < 6; ++j){
                    if (tmp == str1[j]){
                        result += str2[j];
                        ++go;
                        mark = go;
                        found = true;
                        break;
                    }
                }
                if (!found) ++go;
            }
        }
    }
    
    return result;
}

LeetCode-5382 HTML 实体解析器 粒子

📝题目

📝思路

📝题解

LeetCode-5382 HTML 实体解析器粒子