damianr13 opened this issue on Aug 25, 2022 · 1 comment · May be fixed by #1560

Labels: feature (new features or improvements to existing features) · t-tooling (owned by the tooling team)
Describe the feature
I want to be able to specify different timeouts for handlers with different labels. When the website I am crawling has two or more types of pages, each handler does different work, so the time it takes to process a page varies by type.
Motivation
I am trying to crawl a category page that uses "infinite scroll" plus a "load more" button instead of pagination. Similar to the example in the tutorial (https://crawlee.dev/docs/introduction/scraping), I have two types of pages: LIST and DETAIL.

Currently I hit a timeout before all the elements on the LIST page can load. I looked it up and found the requestHandlerTimeoutSecs parameter, which can be passed to the crawler to raise the timeout limit. My understanding is that this limit applies to all requests regardless of their type, but I would still like to keep the limit for an individual DETAIL page lower than the high value I need for the LIST page.
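In the meantime, a minimal workaround sketch (plain Node, no Crawlee imports): enforce a per-label budget inside the handler itself by racing the work against a timer, while keeping the crawler-wide requestHandlerTimeoutSecs high. The label names, timeout values, and helper names here are hypothetical, chosen only to illustrate the behaviour this feature request asks for.

```typescript
// Hypothetical per-label budgets, in seconds: LIST pages keep loading
// more items via infinite scroll, DETAIL pages are quick.
const TIMEOUT_SECS: Record<string, number> = { LIST: 120, DETAIL: 30 };

// Reject if `handler` does not settle within `ms` milliseconds.
async function withTimeout<T>(handler: () => Promise<T>, ms: number): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(
      () => reject(new Error(`handler timed out after ${ms} ms`)),
      ms,
    );
  });
  try {
    return await Promise.race([handler(), timeout]);
  } finally {
    clearTimeout(timer);
  }
}

// Inside a route handler, look up the budget by the request's label
// (60 s as a hypothetical fallback for unlabeled requests):
async function handleRequest(label: string, handler: () => Promise<void>): Promise<void> {
  return withTimeout(handler, (TIMEOUT_SECS[label] ?? 60) * 1000);
}
```

Note this only makes the handler's promise reject early; unlike a native per-label requestHandlerTimeoutSecs, it cannot abort work the handler has already started.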
@B4nan we should do this. I literally had the same idea 1 hour ago when playing with the router middlewares. I even think we should allow the router to set different preNavigationHooks and other options.
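To make the idea concrete, here is a self-contained sketch of what a router with per-route options could look like. This is not Crawlee's actual API: the `addHandler` options object, `timeoutSecs`, and `optionsFor` are all hypothetical names, illustrating how a route registration could carry a timeout and its own preNavigationHooks.

```typescript
type Hook = () => Promise<void> | void;

// Hypothetical per-route options a crawler could honour.
interface RouteOptions {
  timeoutSecs?: number;
  preNavigationHooks?: Hook[];
}

class LabelRouter {
  private routes = new Map<
    string,
    { handler: () => Promise<void>; options: RouteOptions }
  >();

  // Register a handler for a label, optionally with route-level options.
  addHandler(label: string, handler: () => Promise<void>, options: RouteOptions = {}): void {
    this.routes.set(label, { handler, options });
  }

  // The crawler would consult this when processing a request with `label`,
  // e.g. to pick the timeout before invoking the handler.
  optionsFor(label: string): RouteOptions {
    return this.routes.get(label)?.options ?? {};
  }

  // Run the route's own pre-navigation hooks, then its handler.
  async run(label: string): Promise<void> {
    const route = this.routes.get(label);
    if (!route) throw new Error(`No handler registered for label ${label}`);
    for (const hook of route.options.preNavigationHooks ?? []) await hook();
    await route.handler();
  }
}
```

Under this shape, a LIST route could register `{ timeoutSecs: 120 }` while DETAIL keeps a lower default, which is exactly the split the issue asks for.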