Equivalent JavaScript functions for Python's urllib.parse.quote() and urllib.parse.unquote()

Posted by bvpmtnay on 2023-03-21 in Java

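Are there any JavaScript functions equivalent to Python's urllib.parse.quote() and urllib.parse.unquote()?
The closest I have come across are encodeURI()/encodeURIComponent() and escape() (along with their corresponding decoding functions), but as far as I can tell they don't encode/decode the same set of special characters.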

Source: https://stackoverflow.com/questions/946170/equivalent-javascript-functions-for-pythons-urllib-parse-quote-and-urllib-par

8 Answers

wyyhbhjk1#

JavaScript               |  Python
------------------------------------------------------------------------------
encodeURI(str)           |  urllib.parse.quote(str, safe='~@#$&()*!+=:;,?/\'')
------------------------------------------------------------------------------
encodeURIComponent(str)  |  urllib.parse.quote(str, safe='~()*!\'')

In Python 3.7+, you can drop ~ from safe=, because quote() never escapes it from that version on.
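To make the JavaScript column of the table concrete, here is a small sketch that runs both functions over one sample string. The variable names (sample, asUri, asComponent) are just for illustration. With the safe= values listed in the table, Python's quote() should produce the same two strings.

```javascript
// A sample mixing a space, URL-reserved characters and "mark" characters
const sample = "a b/c?d=e&f#g~(h)'*!";

// encodeURI keeps reserved characters such as / ? = & # intact
const asUri = encodeURI(sample);

// encodeURIComponent escapes them, but still keeps ~ ( ) ' * !
const asComponent = encodeURIComponent(sample);

console.log(asUri);        // a%20b/c?d=e&f#g~(h)'*!
console.log(asComponent);  // a%20b%2Fc%3Fd%3De%26f%23g~(h)'*!
```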

xdnvmnnf2#

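OK, I think I'm going to go with a hybrid set of custom functions:
Encode: use encodeURIComponent(), then put the slashes back in.
Decode: decode any %hex values found.
Here is a more complete variant of what I ended up using (it also handles Unicode properly):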

function quoteUrl(url, safe) {
    if (typeof safe !== 'string') {
        safe = '/';    // Don't escape slashes by default
    }

    // Note: encodeURIComponent leaves !'()* unescaped, unlike Python's quote()
    url = encodeURIComponent(url);

    // Unescape characters that were in the safe list
    var toUnencode = [];
    for (var i = safe.length - 1; i >= 0; --i) {
        var encoded = encodeURIComponent(safe.charAt(i));
        if (encoded !== safe.charAt(i)) {    // Ignore safe char if it wasn't escaped
            toUnencode.push(encoded);
        }
    }

    if (toUnencode.length > 0) {    // An empty pattern would match at every position
        url = url.replace(new RegExp(toUnencode.join('|'), 'ig'), decodeURIComponent);
    }

    return url;
}

var unquoteUrl = decodeURIComponent;    // Make alias to have symmetric function names

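Note that if you don't need any "safe" characters when encoding ('/' by default in Python), you can just use the built-in encodeURIComponent() and decodeURIComponent() functions directly.
Also, if the string contains Unicode characters (i.e. characters with code point >= 128), then to stay compatible with JavaScript's encodeURIComponent(), the Python 2 quote_url() would have to be as follows (in Python 3, urllib.parse.quote() already encodes str as UTF-8 by default):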

import urllib  # Python 2 module layout

def quote_url(url, safe):
    """URL-encodes a string (either str (i.e. ASCII) or unicode);
    uses de-facto UTF-8 encoding to handle Unicode codepoints in given string.
    """
    return urllib.quote(unicode(url).encode('utf-8'), safe)

And unquote_url() would be:

def unquote_url(url):
    """Decodes a URL that was encoded using quote_url.
    Returns a unicode instance.
    """
    return urllib.unquote(url).decode('utf-8')

vmjh9lq93#

If you don't mind an extra dependency, the requests library is a more popular option:

from requests.utils import quote
quote(str)

nnt7mjpx4#

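Python: Python
I haven't done extensive testing, but for my purposes it works most of the time. I guess you have some specific characters that don't work. Maybe if I used some Asian text or something it would break :)
This came up when I googled, so I'm putting this here for everyone else, even if it isn't specifically for the original question.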

eblbsuwk5#

Here is an implementation based on the GitHub repo purescript-python:

import urllib.parse as urllp
def encodeURI(s): return urllp.quote(s, safe="~@#$&()*!+=:;,.?/'")
def decodeURI(s): return urllp.unquote(s, errors="strict")
def encodeURIComponent(s): return urllp.quote(s, safe="~()*!.'")
def decodeURIComponent(s): return urllp.unquote(s, errors="strict")

ghhkc1vu6#

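Try a regular expression. Something like this:

mystring.replace(/[\u0100-\uFFFF]/g, function (c) { return "%" + c.charCodeAt(0).toString(16).toUpperCase(); });

This replaces every character with an ordinal above 255 with % followed by the hex value of its UTF-16 code unit. Note that this is not standard percent-encoding (which encodes UTF-8 bytes), so Python's unquote() will not turn the result back into the original character.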
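If the goal is output that Python's unquote() can read back, one option is to let encodeURIComponent() handle only the non-ASCII runs, since it emits UTF-8 bytes as %XX. The sketch below assumes that is the intent. The function name quoteNonAscii is hypothetical, and ASCII characters are deliberately left untouched.

```javascript
// Percent-encode only non-ASCII characters, as UTF-8 bytes.
// Matching whole runs keeps surrogate pairs together; a lone surrogate
// would still make encodeURIComponent throw a URIError.
function quoteNonAscii(s) {
  return s.replace(/[^\x00-\x7F]+/g, (run) => encodeURIComponent(run));
}

console.log(quoteNonAscii("a b/中"));  // a b/%E4%B8%AD
```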

bjg7j2ky7#

decodeURIComponent() is similar to unquote:

const unquote = decodeURIComponent
const unquote_plus = (s) => decodeURIComponent(s.replace(/\+/g, ' '))

If either of the two characters after % is not a hex digit (or there aren't two characters after %), JavaScript throws URIError: URI malformed, whereas Python leaves the % unchanged.
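If you want the Python-style tolerance on the JavaScript side, a possible sketch is to decode only well-formed runs of %XX escapes and leave everything else alone. The name unquoteLenient is hypothetical. One difference remains: when the bytes are not valid UTF-8, this version returns the run unchanged, while Python's unquote() substitutes replacement characters by default.

```javascript
// Decode runs of valid %XX escapes; a stray or malformed % is kept as-is.
function unquoteLenient(s) {
  return s.replace(/(?:%[0-9A-Fa-f]{2})+/g, (run) => {
    try {
      return decodeURIComponent(run);
    } catch (e) {
      return run;  // escapes are well-formed but not valid UTF-8
    }
  });
}

console.log(unquoteLenient("a%20b%zz"));  // a b%zz
console.log(unquoteLenient("100%"));      // 100%
```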
encodeURIComponent() is not quite the same as quote: you need to percent-encode a few more characters and un-escape /:

const quoteChar = (c) => '%' + c.charCodeAt(0).toString(16).padStart(2, '0').toUpperCase()
const quote = (s) => encodeURIComponent(s).replace(/[()*!']/g, quoteChar).replace(/%2F/g, '/')

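// Python's quote_plus defaults to safe='', so '/' stays percent-encoded here
const quote_plus = (s) => encodeURIComponent(s).replace(/[()*!']/g, quoteChar).replace(/%20/g, '+')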

The characters that Python's quote leaves unescaped are documented and listed (on Python 3.7+) as: "Letters, digits, and the characters '_.-~' are never quoted. By default, this function is intended for quoting the path section of a URL. The optional safe parameter specifies additional ASCII characters that should not be quoted - its default value is '/'"
The characters that JavaScript's encodeURIComponent leaves unencoded are documented and listed as uriAlpha (upper- and lowercase ASCII letters), DecimalDigit and uriMark, which are -_.!~*'().
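Putting the two character sets side by side suggests a general version that also takes a safe argument. The following is only a sketch under that reading of the documentation. The names pyQuote and ALWAYS_SAFE are made up for illustration, and it has not been checked against every edge case of Python's implementation.

```javascript
// Characters Python 3.7+ never quotes: letters, digits and "_.-~"
const ALWAYS_SAFE = /[A-Za-z0-9_.\-~]/;

function pyQuote(s, safe = "/") {
  let out = "";
  for (const ch of s) {  // iterate by code point, not by UTF-16 unit
    const isAscii = ch.charCodeAt(0) < 128;
    if (isAscii && (ALWAYS_SAFE.test(ch) || safe.includes(ch))) {
      out += ch;  // only ASCII characters can be marked safe
    } else {
      // encodeURIComponent emits UTF-8 bytes as %XX but skips !'()*,
      // so those five are escaped by hand
      out += encodeURIComponent(ch).replace(
        /[!'()*]/g,
        (c) => "%" + c.charCodeAt(0).toString(16).toUpperCase()
      );
    }
  }
  return out;
}

console.log(pyQuote("a b/c"));      // a%20b/c
console.log(pyQuote("a b/c", ""));  // a%20b%2Fc
```

Passing safe as an empty string matches the default of Python's quote_plus, apart from the space-to-plus substitution.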

w9apscun8#

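I pass text files back and forth between Python and JavaScript.
Although urllib.parse.quote (on the Python side) and decodeURIComponent (on the JavaScript side) appear to work fine, they may not work correctly for every character.
So I wrote my own functions, which should be 100% reliable no matter what characters are in the text file.
On the Python side, I use xxd to encode the file. xxd is a Linux utility that converts a binary file into a string of two hex digits per byte. The Python code that encodes the file into a string of hex codes is: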

import os
mystring = os.popen("xxd -p "+your_file_name_here).read().replace('\n','')

On the JavaScript side, this function restores the hex-encoded file to the original text string:

function unxxd(str) {
  var s = "";
  // get two chars at a time; each pair is one byte in hex
  for (var i = 0; i < str.length; i = i + 2) {
    s += String.fromCharCode(parseInt(str.substr(i, 2), 16));
  }
  return s;
}
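One caveat: the function above turns each byte into one character, so text that was stored as multi-byte UTF-8 comes out garbled for anything outside ASCII. If the file is known to be UTF-8 (an assumption, since the answer does not state the encoding), a variant along these lines could decode the bytes properly. The name hexToText is hypothetical, and TextDecoder is assumed to be available, as it is in current browsers and Node.js.

```javascript
// Convert a hex string (two digits per byte) back to text, treating the
// bytes as UTF-8 rather than as one character per byte.
function hexToText(hex) {
  if (hex.length % 2 !== 0) {
    throw new Error("hex string must have an even number of digits");
  }
  const bytes = new Uint8Array(hex.length / 2);
  for (let i = 0; i < bytes.length; i++) {
    bytes[i] = parseInt(hex.slice(2 * i, 2 * i + 2), 16);
  }
  return new TextDecoder("utf-8").decode(bytes);
}

console.log(hexToText("68656c6c6f"));  // hello
console.log(hexToText("e4b8ad"));      // 中
```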
