feat: add Session binding capability via session_id in Request (#1086)
### Description
- Add strict binding of a `Request` to a specific `Session`. If the
`Session` is not available in the `SessionPool`, an error will be raised
for the `Request` which can be handled in the `failed_request_handler`.
### Issues
- Closes: #1076
### Testing
Added tests to verify functionality:
- Binding to a valid session
- Binding to a non-existent session
- Catching error in `failed_request_handler`
The <ApiLink to="class/SessionPool">`SessionPool`</ApiLink> class provides a robust way to manage the rotation of proxy IP addresses, cookies, and other custom settings in Crawlee. Its primary advantage is the ability to filter out blocked or non-functional proxies, ensuring that your scraper avoids retrying requests through known problematic proxies.
@@ -68,3 +70,25 @@ Now, let's explore examples of how to use the <ApiLink to="class/SessionPool">`S
These examples demonstrate the basics of configuring and using the <ApiLink to="class/SessionPool">`SessionPool`</ApiLink>.

Please bear in mind that <ApiLink to="class/SessionPool">`SessionPool`</ApiLink> requires some time to establish a stable pool of working IPs. During the initial setup, you may encounter errors as the pool identifies and filters out blocked or non-functional IPs. This stabilization period is expected, and results improve over time.
## Configuring a single session

In some cases, you need full control over session usage, for example when working with websites that require authentication or the initialization of certain parameters such as cookies.

When working with a site that requires authentication, we typically don't want multiple sessions with different browser fingerprints or client parameters accessing the site. In this case, we need to configure the <ApiLink to="class/SessionPool">`SessionPool`</ApiLink> appropriately:

<CodeBlock language="py">
{OneSession}
</CodeBlock>
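The reasoning behind a single-session setup can be sketched without Crawlee itself. The toy model below is not Crawlee's API: `ToySession`, `ToyPool`, and their parameters are hypothetical names chosen for illustration. It assumes a pool that replaces a session once its usage limit is reached, and shows that a pool of size one only keeps handing out the same session when that limit is generous.

```python
import itertools
from dataclasses import dataclass


@dataclass
class ToySession:
    """Hypothetical stand-in for a session; not Crawlee's Session class."""

    id: str
    max_usage_count: int
    usage_count: int = 0

    @property
    def is_usable(self) -> bool:
        # A session is retired once it has been used `max_usage_count` times.
        return self.usage_count < self.max_usage_count


class ToyPool:
    """Hypothetical pool holding at most `max_pool_size` sessions."""

    def __init__(self, max_pool_size: int, max_usage_count: int) -> None:
        self._max_pool_size = max_pool_size
        self._max_usage_count = max_usage_count
        self._sessions = []
        self._ids = itertools.count(1)
        self._cursor = 0

    def get_session(self) -> ToySession:
        # Drop sessions that have exhausted their usage limit.
        self._sessions = [s for s in self._sessions if s.is_usable]
        # Top the pool back up to its maximum size with fresh sessions.
        while len(self._sessions) < self._max_pool_size:
            new_id = f'session_{next(self._ids)}'
            self._sessions.append(ToySession(new_id, self._max_usage_count))
        # Rotate through the available sessions in round-robin order.
        session = self._sessions[self._cursor % len(self._sessions)]
        self._cursor += 1
        session.usage_count += 1
        return session


# One slot and a very high usage limit: every request shares one session.
single = ToyPool(max_pool_size=1, max_usage_count=1_000_000)
ids_single = {single.get_session().id for _ in range(100)}

# One slot but a low usage limit: the session is silently replaced.
short_lived = ToyPool(max_pool_size=1, max_usage_count=10)
ids_short = {short_lived.get_session().id for _ in range(100)}

print(ids_single)      # a single session id
print(len(ids_short))  # several distinct session ids
```

The design point is that limiting the pool size alone is not enough. If the session can still expire or be retired, an authenticated state tied to it is lost, so the limits that retire a session need to be raised as well.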
## Binding requests to specific sessions

The previous example has one obvious limitation: you're restricted to a single session.

In some cases, we need the same behavior with multiple sessions running in parallel, such as authenticating with different profiles or using different proxies.

To do this, use the `session_id` parameter of the <ApiLink to="class/Request">`Request`</ApiLink> object to bind a request to a specific session.
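The binding behavior described in this commit can be illustrated with a small, self-contained model. This is a sketch under stated assumptions, not Crawlee's implementation: `ToyRequest`, `ToyBoundPool`, `ToySessionNotFoundError`, and `run` are hypothetical names. It assumes that a request carrying a `session_id` is never given a different session, and that a missing session produces an error that is passed to a failure handler, mirroring the role of `failed_request_handler`.

```python
from dataclasses import dataclass
from typing import Optional


class ToySessionNotFoundError(Exception):
    """Hypothetical error for a request bound to a session the pool lacks."""


@dataclass
class ToyRequest:
    url: str
    session_id: Optional[str] = None  # None means any session will do


class ToyBoundPool:
    """Hypothetical pool that supports lookup by session id."""

    def __init__(self, session_ids) -> None:
        self._sessions = {sid: {'id': sid, 'cookies': {}} for sid in session_ids}
        self._order = list(session_ids)
        self._cursor = 0

    def get_session(self, session_id: Optional[str] = None) -> dict:
        if session_id is not None:
            # Strict binding: never fall back to some other session.
            if session_id not in self._sessions:
                raise ToySessionNotFoundError(session_id)
            return self._sessions[session_id]
        # Unbound requests rotate through the pool in round-robin order.
        session = self._sessions[self._order[self._cursor % len(self._order)]]
        self._cursor += 1
        return session


def run(requests, pool, failed_request_handler):
    """Resolve a session for each request; report failures to the handler."""
    used = []
    for request in requests:
        try:
            session = pool.get_session(request.session_id)
        except ToySessionNotFoundError as error:
            failed_request_handler(request, error)
            continue
        used.append((request.url, session['id']))
    return used


failures = []
pool = ToyBoundPool(['alice', 'bob'])
requests = [
    ToyRequest('https://example.com/a', session_id='alice'),
    ToyRequest('https://example.com/b', session_id='bob'),
    ToyRequest('https://example.com/c', session_id='carol'),  # not in the pool
]
used = run(requests, pool, lambda request, error: failures.append(request.url))
print(used)
print(failures)
```

Failing loudly instead of substituting another session is the key design choice here. A request meant for one authenticated profile should not be silently sent through a different one.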