Selenium: finding options by tag name

dbf7pr2w  posted on 2022-11-24  in Other
Follows (0) | Answers (3) | Views (188)

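I am trying to scrape all the data from a website called Correios. The site has several dropdown menus I need to work with, and I have run into a problem: looking up the options returns a list that contains a bunch of empty strings.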

from selenium import webdriver

chrome_path = r"C:\Users\Gustavo\Desktop\geckodriver\chromedriver.exe"
driver = webdriver.Chrome(chrome_path)
lista_x = []
driver.get("http://www2.correios.com.br/sistemas/agencias/")
driver.maximize_window()

dropdownEstados = driver.find_elements_by_xpath("""//*[@id="estadoAgencia"]""")

optEstados = driver.find_elements_by_tag_name("option")

for valores in optEstados:
    print(valores.text.encode())

This is what I get:

b''
b'ACRE'
b'ALAGOAS'
b'AMAP\xc3\x81'
b'AMAZONAS'
b'BAHIA'
b'CEAR\xc3\x81'
b'DISTRITO FEDERAL'
b'ESP\xc3\x8dRITO SANTO'
b'GOI\xc3\x81S'
b'MARANH\xc3\x83O'
b'MINAS GERAIS'
b'MATO GROSSO DO SUL'
b'MATO GROSSO'
b'PAR\xc3\x81'
b'PARA\xc3\x8dBA'
b'PERNAMBUCO'
b'PIAU\xc3\x8d'
b'PARAN\xc3\x81'
b'RIO DE JANEIRO'
b'RIO GRANDE DO NORTE'
b'ROND\xc3\x94NIA'
b'RORAIMA'
b'RIO GRANDE DO SUL'
b'SANTA CATARINA'
b'SERGIPE'
b'S\xc3\x83O PAULO'
b'TOCANTINS'
b''
b''
b''
b''
b''
b''
b''
b''
b''
b''
b''
b''
b''
b''
b''
b''
b''
b''
b''
b''
b''
b''
b''
b''
b''
b''
b''

How can I get rid of the empty b'' entries?

Answer #1 (vbopmzt1)

If I understand correctly, you want to find all the options of that dropdown.

Try the following XPath expression to locate them:

//*[@id="estadoAgencia"]/option

Code example:

from selenium import webdriver

chrome_path = r"C:\Users\Gustavo\Desktop\geckodriver\chromedriver.exe"
driver = webdriver.Chrome(chrome_path)
lista_x = []
driver.get("http://www2.correios.com.br/sistemas/agencias/")
driver.maximize_window()

dropdownEstados = driver.find_elements_by_xpath("//*[@id='estadoAgencia']")

# Find elements in dropdown
optEstados = driver.find_elements_by_xpath("//*[@id='estadoAgencia']/option")

for valores in optEstados:
    print(valores.text.encode())

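With this XPath expression you get only the options that belong to this dropdown, so the result has no empty strings apart from one element of the dropdown itself. Output: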

b''
b'ACRE'
b'ALAGOAS'
b'AMAP\xc3\x81'
b'AMAZONAS'
b'BAHIA'
b'CEAR\xc3\x81'
b'DISTRITO FEDERAL'
b'ESP\xc3\x8dRITO SANTO'
b'GOI\xc3\x81S'
b'MARANH\xc3\x83O'
b'MINAS GERAIS'
b'MATO GROSSO DO SUL'
b'MATO GROSSO'
b'PAR\xc3\x81'
b'PARA\xc3\x8dBA'
b'PERNAMBUCO'
b'PIAU\xc3\x8d'
b'PARAN\xc3\x81'
b'RIO DE JANEIRO'
b'RIO GRANDE DO NORTE'
b'ROND\xc3\x94NIA'
b'RORAIMA'
b'RIO GRANDE DO SUL'
b'SANTA CATARINA'
b'SERGIPE'
b'S\xc3\x83O PAULO'
b'TOCANTINS'
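
The difference between the two locators can be reproduced offline. The following is a minimal sketch, assuming a made-up page fragment with two dropdowns (the `municipioAgencia` id and the markup are illustrative, not taken from the Correios page) and using the limited XPath support in Python's standard `xml.etree.ElementTree` instead of Selenium:

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed fragment standing in for the real page.
PAGE = """
<form>
  <select id="estadoAgencia">
    <option value=""></option>
    <option value="AC">ACRE</option>
    <option value="AL">ALAGOAS</option>
  </select>
  <select id="municipioAgencia">
    <option value=""></option>
  </select>
</form>
"""

root = ET.fromstring(PAGE)

# Unscoped: every <option> in the document, whichever <select> owns it.
all_options = root.findall(".//option")

# Scoped: only <option> children of the element whose id is estadoAgencia.
state_options = root.findall(".//*[@id='estadoAgencia']/option")

print(len(all_options))    # 4
print(len(state_options))  # 3
print([opt.text or "" for opt in state_options])  # ['', 'ACRE', 'ALAGOAS']
```

The unscoped search also picks up the option of the second dropdown, which is the kind of extra entry a page-wide tag-name lookup can return.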

Note: the first element of this dropdown is itself an empty string.
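
If that blank entry is not wanted either, it can be filtered out after the lookup. A minimal sketch, assuming the option texts have already been collected into a plain list (`raw_texts` is a hypothetical stand-in for `[valores.text for valores in optEstados]`). It also shows that the `b'...'` form in the output comes from `.encode()`, so printing the string directly keeps the accents readable:

```python
# Hypothetical stand-in for [valores.text for valores in optEstados].
raw_texts = ["", "ACRE", "ALAGOAS", "AMAPÁ", "", ""]

# Keep only entries that still contain something after trimming whitespace.
estados = [text.strip() for text in raw_texts if text.strip()]

for estado in estados:
    print(estado)

# .encode() is what yields the bytes form seen in the output above.
encoded = "AMAPÁ".encode()  # UTF-8 by default
print(encoded)
print(encoded.decode("utf-8"))
```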

Answer #2 (k97glaaz)

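Your code needs one small change: look up the <select> element first, then search for the option tags inside it rather than in the whole page:

# find_element (singular) returns the <select> itself
dropdownEstados = driver.find_element_by_xpath("""//*[@id="estadoAgencia"]""")
# search only among the descendants of that element
optEstados = dropdownEstados.find_elements_by_tag_name("option")

for valores in optEstados:
    print(valores.text.encode())

Answer #3 (bzzcjhmw)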

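To retrieve the text of every <option> in the dropdown whose id is estadoAgencia: since the element is a <select> tag, it is easier and more effective to use the methods associated with <select> tags (Selenium's Select class). You can use the following solution:

  • Code block:
from selenium.webdriver.support.ui import Select

# Wrap the <select> element; .options lists its <option> children
estado_select = Select(driver.find_element_by_id('estadoAgencia'))
for opt in estado_select.options:
    print(opt.get_attribute('innerHTML'))
  • Console output: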
ACRE
ALAGOAS
AMAPÁ
AMAZONAS
BAHIA
CEARÁ
DISTRITO FEDERAL
ESPÍRITO SANTO
GOIÁS
MARANHÃO
MINAS GERAIS
MATO GROSSO DO SUL
MATO GROSSO
PARÁ
PARAÍBA
PERNAMBUCO
PIAUÍ
PARANÁ
RIO DE JANEIRO
RIO GRANDE DO NORTE
RONDÔNIA
RORAIMA
RIO GRANDE DO SUL
SANTA CATARINA
SERGIPE
SÃO PAULO
TOCANTINS
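
One thing to keep in mind with `innerHTML` is that it returns markup rather than rendered text, so a character such as `&` comes back as an entity and surrounding whitespace is preserved. A minimal sketch of cleaning such values with the standard library, assuming hypothetical sample strings (none of the state names above actually contain entities) and a helper name, `clean_inner_html`, introduced only for illustration:

```python
from html import unescape


def clean_inner_html(value: str) -> str:
    """Turn a text-only option's innerHTML into plain display text."""
    return unescape(value).strip()


# Hypothetical innerHTML values, not taken from the Correios page.
samples = ["  ACRE ", "A &amp; B"]
print([clean_inner_html(s) for s in samples])  # ['ACRE', 'A & B']
```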
