Scraping from a search field

I am trying to scrape the State of Rio government salary details available at http:// www.consultaremuneracao.rj.gov.br/pages/welcome.jsf using Scrapy and Selenium, and I already have a working script that stores the results. My problem now is constructing the search queries so that I get all the results, ideally with the minimum number of searches.

The query is "wildcarded" on both sides, so "bel" returns both "isabel" and "bela". Results are also limited to 100 per search. Strangely, the search field accepts wildcards such as % and _ as well.

Does anybody know of a good strategy for building the search queries under these restrictions? I have seen a strategy for searches that place a wildcard only at the end of the query, but I think it would be terribly inefficient when the query is also wildcarded at the beginning.
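To make the question concrete, here is a minimal sketch of one refinement strategy I could imagine, not a known-good solution. It assumes a hypothetical `search(query)` callable (for example, a wrapper around the Selenium script) that returns the hashable rows matching a substring query, capped at `limit`. The names `enumerate_all`, `alphabet`, and `max_len` are made up for illustration. The idea: issue every single-character query, and whenever a query hits the cap, extend it by one character on the right. Because matching is by substring, any query that contains an already-complete (uncapped) query can be skipped, since its results are a subset of what was already retrieved.

```python
from collections import deque


def enumerate_all(search, alphabet, limit=100, max_len=6):
    """Collect rows from a substring search that caps results at `limit`.

    `search` is a hypothetical callable: query string -> iterable of
    hashable rows. `alphabet` should not contain % or _, since the site
    treats those as wildcards.
    """
    seen = set()        # every distinct row retrieved so far
    complete = []       # queries that came back under the cap
    unresolved = []     # capped queries that reached max_len
    issued = 0          # number of searches actually sent
    queue = deque(alphabet)  # breadth-first: short queries first

    while queue:
        query = queue.popleft()
        # Substring matching: if an uncapped query u occurs inside this
        # query, everything this query could return was already fetched.
        if any(u in query for u in complete):
            continue
        rows = list(search(query))
        issued += 1
        seen.update(rows)
        if len(rows) < limit:
            complete.append(query)
        elif len(query) < max_len:
            # Cap hit: refine by appending each possible next character.
            queue.extend(query + c for c in alphabet)
        else:
            unresolved.append(query)
    return seen, unresolved, issued


if __name__ == "__main__":
    # In-memory stand-in for the real site, capped at 3 rows per query.
    people = ["isabel", "bela", "ana", "anabel", "beatriz", "carla"]

    def demo_search(query):
        return [p for p in people if query in p][:3]

    found, unresolved, issued = enumerate_all(
        demo_search, "abcdefghijklmnopqrstuvwxyz", limit=3
    )
    print(len(found), "rows from", issued, "searches; unresolved:", unresolved)
```

A known gap in this sketch: if a complete name, used as a query, still hits the cap (many people share it, or it is contained in many longer names), the rows for that exact name may never all be returned. Queries that stay capped at `max_len` are reported in `unresolved` rather than silently dropped.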

(removed links for lack of karma)