Commit 3741ac0

Mantisus and vdusek authored
Apply suggestions from code review
Co-authored-by: Vlada Dusek <[email protected]>
1 parent 5d485a1 commit 3741ac0

File tree

1 file changed: +4 −4 lines changed


docs/guides/error_handling.mdx (+4 −4)
@@ -11,11 +11,11 @@ import HandleProxyError from '!!raw-loader!roa-loader!./code_examples/error_hand
 import ChangeHandleErrorStatus from '!!raw-loader!roa-loader!./code_examples/error_handling/change_handle_error_status.py';
 import DisableRetry from '!!raw-loader!roa-loader!./code_examples/error_handling/disable_retry.py';

-This guide shows you how to handle common errors that happen when crawling websites.
+This guide demonstrates techniques for handling common errors encountered during web crawling operations.

 ## Handling proxy errors

-Low-quality proxies can cause problems even with high settings for `max_request_retries` and `max_session_rotations` in <ApiLink to="class/BasicCrawlerOptions">BasicCrawlerOptions</ApiLink>. If you can't get data because of proxy errors, you might want to try again. You can do this using <ApiLink to="class/BasicCrawler#failed_request_handler">failed_request_handler</ApiLink>:
+Low-quality proxies can cause problems even with high settings for `max_request_retries` and `max_session_rotations` in <ApiLink to="class/BasicCrawlerOptions">`BasicCrawlerOptions`</ApiLink>. If you can't get data because of proxy errors, you might want to try again. You can do this using <ApiLink to="class/BasicCrawler#failed_request_handler">`failed_request_handler`</ApiLink>:

 <RunnableCodeBlock className="language-python" language="python">
 	{HandleProxyError}
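For orientation, here is a minimal sketch of the pattern that paragraph describes: a `failed_request_handler` that re-enqueues a request once its normal retries are exhausted. This is not the guide's own `HandleProxyError` listing; it assumes the decorator-style registration and `crawlee.crawlers` import paths of recent Crawlee for Python releases, and the `ProxyError` check plus the fresh-unique-key re-enqueue are illustrative assumptions.

```python
import asyncio

from crawlee import Request
from crawlee.crawlers import BasicCrawlingContext, HttpCrawler, HttpCrawlingContext
from crawlee.errors import ProxyError  # assumption: proxy failures surface as this type


async def main() -> None:
    # Keep retries low so a broken proxy fails fast and reaches the handler.
    crawler = HttpCrawler(max_request_retries=2, max_session_rotations=2)

    @crawler.router.default_handler
    async def default_handler(context: HttpCrawlingContext) -> None:
        context.log.info(f'Processing {context.request.url}')

    # Called only after a request has exhausted all of its retries.
    @crawler.failed_request_handler
    async def failed_handler(context: BasicCrawlingContext, error: Exception) -> None:
        if isinstance(error, ProxyError):
            # Illustrative recovery: re-enqueue the same URL under a fresh
            # unique key so it gets another full round of attempts.
            retry = Request.from_url(
                context.request.url,
                unique_key=f'{context.request.unique_key}:retry',
            )
            await crawler.add_requests([retry])

    await crawler.run(['https://crawlee.dev/'])


if __name__ == '__main__':
    asyncio.run(main())
```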
@@ -25,7 +25,7 @@ You can use this same approach when testing different proxy providers. To better

 ## Changing how error status codes are handled

-By default, when <ApiLink to="class/Session">Sessions</ApiLink> get status codes like [401](https://developer.mozilla.org/en-US/docs/Web/HTTP/Reference/Status/401), [403](https://developer.mozilla.org/en-US/docs/Web/HTTP/Reference/Status/403), or [429](https://developer.mozilla.org/en-US/docs/Web/HTTP/Reference/Status/429), Crawlee marks the <ApiLink to="class/Session">Session</ApiLink> as `retire` and switches to a new one. This might not be what you want, especially when working with [authentication](./logging-in-with-a-crawler). You can learn more in the [session management guide](./session-management).
+By default, when <ApiLink to="class/Session">`Sessions`</ApiLink> get status codes like [401](https://developer.mozilla.org/en-US/docs/Web/HTTP/Reference/Status/401), [403](https://developer.mozilla.org/en-US/docs/Web/HTTP/Reference/Status/403), or [429](https://developer.mozilla.org/en-US/docs/Web/HTTP/Reference/Status/429), Crawlee marks the <ApiLink to="class/Session">`Session`</ApiLink> as `retire` and switches to a new one. This might not be what you want, especially when working with [authentication](./logging-in-with-a-crawler). You can learn more in the [Session management guide](./session-management).

 Here's an example of how to change this behavior:

@@ -37,7 +37,7 @@ Here's an example of how to change this behavior:

 Sometimes you might get unexpected errors when parsing data, like when a website has an unusual structure. Crawlee normally tries again based on your `max_request_retries` setting, but sometimes you don't want that.

-Here's how to turn off retries for non-network errors using <ApiLink to="class/BasicCrawler#error_handler">error_handler</ApiLink>, which runs before `Crawlee` tries again:
+Here's how to turn off retries for non-network errors using <ApiLink to="class/BasicCrawler#error_handler">`error_handler`</ApiLink>, which runs before Crawlee tries again:

 <RunnableCodeBlock className="language-python" language="python">
 	{DisableRetry}
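Finally, a sketch of the retry-disabling idea: since the `error_handler` runs before each retry attempt, it can inspect the exception and flag the request as non-retriable. The `no_retry` flag and the `SessionError` check are my assumptions about the API, not the guide's `DisableRetry` listing.

```python
import asyncio

from crawlee.crawlers import BasicCrawlingContext, HttpCrawler, HttpCrawlingContext
from crawlee.errors import SessionError


async def main() -> None:
    crawler = HttpCrawler(max_request_retries=5)

    @crawler.router.default_handler
    async def default_handler(context: HttpCrawlingContext) -> None:
        # Stand-in for a parsing step that blows up on unusual page structure.
        raise ValueError(f'Unexpected structure at {context.request.url}')

    # Runs before Crawlee schedules a retry for a failed request.
    @crawler.error_handler
    async def retry_handler(context: BasicCrawlingContext, error: Exception) -> None:
        # Let network/session problems retry as usual, but give up
        # immediately on anything else, such as parsing errors.
        if not isinstance(error, SessionError):
            context.request.no_retry = True  # assumption: Request exposes this flag

    await crawler.run(['https://crawlee.dev/'])


if __name__ == '__main__':
    asyncio.run(main())
```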
