Linux: determining the availability (and price?) of 50,000 domain names

gab6jxml posted on 2024-01-06 in Linux

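I have a list of 50k possible domain names. I want to find out which ones are available and, if possible, how much they cost. The list looks like this: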

presumptuous.ly
principaliti.es
procrastinat.es
productivene.ss
professional.ly
profession.ally
professorshi.ps
prognosticat.es
prohibitioni.st

I tried whois, but it runs far too slowly; at this rate it would not finish within the next 100 years.

import whois

def check_domain(domain):
    try:
        # Get the WHOIS information for the domain
        w = whois.whois(domain)
        return w.status == "free"
    except Exception as e:
        print("Error: ", e)
        print(domain + " had an issue")
        return False

def check_available(matches):
    print('checking availability')
    available=[]
    for match in matches:
        if(check_domain(match)):
            print("found "+match+" available!")
            available.append(match)
    return available


I also tried the names.com/names bulk upload tool, but it doesn't seem to work at all.
How can I determine the availability of these domains?

Answer #1, by k5hmc34c

You can speed up the process with, for example, the multiprocessing package:

import os
import sys
from multiprocessing import Pool

import pandas as pd
from tqdm import tqdm
from whois import whois

# https://stackoverflow.com/a/8391735/10035985
def blockPrint():
    sys.stdout = open(os.devnull, "w")

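def enablePrint():
    # Close the devnull handle opened by blockPrint() before restoring stdout
    if sys.stdout is not sys.__stdout__:
        sys.stdout.close()
    sys.stdout = sys.__stdout__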

def check_domain(domain):
    try:
        blockPrint()
        result = whois(domain)
    except Exception:  # lookup failed (no match, timeout, ...): report no status
        return domain, None
    finally:
        enablePrint()
    return domain, result.status

if __name__ == "__main__":
    domains = [
        "google.com",
        "yahoo.com",
        "facebook.com",
        "xxxnonexistentzzz.domain",
    ] * 100

    results = []
    with Pool(processes=16) as pool:  # <-- choose the number of worker processes here
        for domain, status in tqdm(
            pool.imap_unordered(check_domain, domains), total=len(domains)
        ):
            results.append((domain, not bool(status)))

    df = pd.DataFrame(results, columns=["domain", "is_free"])
    print(df.drop_duplicates())

Prints:

100%|██████████████████████████████████████████████| 400/400 [00:07<00:00, 55.67it/s]

                      domain  is_free
0   xxxnonexistentzzz.domain     True
5               facebook.com    False
11                google.com    False
14                 yahoo.com    False


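You can see that it checks about 55 domains per second. At that rate, 50,000 domains would take roughly 15 minutes (50000 / 55 ≈ 909 s), provided the WHOIS servers do not throttle the requests.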
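One caveat with the code above: a lookup that fails (a timeout or a rate limit, say) also ends up with a status of None and is therefore reported as free. Below is a minimal sketch that keeps the three outcomes apart. It is not the code from the answer: it uses threads instead of a multiprocessing pool, since the work is network-bound, and `classify`, `probe` and the `lookup` parameter are hypothetical names introduced only for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def classify(domains, lookup, workers=16):
    """Sort domains into "registered", "free" and "failed" buckets.

    `lookup` is a hypothetical callable (an assumption of this sketch): it
    takes a domain name and returns its WHOIS status, None or empty when the
    domain is unregistered, and raises when the query itself fails.
    """
    def probe(domain):
        try:
            return domain, "registered" if lookup(domain) else "free"
        except Exception:
            # Timeouts and rate limits land here instead of counting as free.
            return domain, "failed"

    buckets = {"registered": [], "free": [], "failed": []}
    # WHOIS lookups spend their time waiting on the network, so threads suffice.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for domain, outcome in pool.map(probe, domains):
            buckets[outcome].append(domain)
    return buckets
```

With the whois package imported above, `lookup` could be something like `lambda d: whois(d).status`. Depending on the library version, an unregistered domain may be signalled by an exception rather than an empty status; in that case the wrapper passed as `lookup` would need to catch that specific error and return None. Domains that end up in the "failed" bucket can be retried later with fewer workers.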
