115 Commits

Author SHA1 Message Date
HoneyryderChuck
43016795f3 introducing the :no_proxy option
can be passed in the `:proxy` option hash, and receives domains, as
strings, for which requests should not go through the proxy.
2022-08-05 22:37:52 +01:00
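As a rough illustration of the matching such an option implies, here is a minimal sketch in plain Ruby. `bypass_proxy?` is a hypothetical helper, not part of httpx, and the rule that a listed domain also covers its subdomains is an assumption rather than something the commit message states.

```ruby
# Hypothetical helper (not httpx API): decide whether a request to `host`
# should skip the proxy, given a list of domain strings.
def bypass_proxy?(host, no_proxy)
  host = host.downcase
  no_proxy.any? do |domain|
    domain = domain.downcase.delete_prefix(".")
    # Assumption: a listed domain matches itself and any of its subdomains.
    host == domain || host.end_with?(".#{domain}")
  end
end

bypass_proxy?("api.internal.example", ["internal.example"]) # subdomain of a listed domain
bypass_proxy?("example.org", ["internal.example"])          # unrelated host
```

Matching on a leading dot boundary keeps a host such as `notinternal.example` from being treated as part of `internal.example`.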
HoneyryderChuck
8359d6b007 Merge branch 'issue-200' into 'master'
response_cache: fixes and improvements

Closes #200

See merge request honeyryderchuck/httpx!216
2022-08-01 17:50:21 +00:00
HoneyryderChuck
61c36c4ef9 response cache: caching several instances for the same URL
by relying on vary header, this should have the effect of not
overflowing, and doing what the user wants.
2022-08-01 18:40:21 +01:00
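The idea of keeping several cached responses per URL, told apart by the request headers named in `Vary`, could be sketched as follows. `VaryCache` and its methods are hypothetical names for illustration only; this is not the plugin's actual storage code, and header names are assumed to be lowercased by the caller.

```ruby
# Hypothetical in-memory store, for illustration only.
class VaryCache
  Entry = Struct.new(:vary_values, :body)

  def initialize
    @entries = Hash.new { |hash, url| hash[url] = [] }
  end

  # vary: header names listed in the response's Vary header.
  # request_headers: hash with lowercased header names (assumption).
  def store(url, request_headers, vary, body)
    values = vary.to_h { |name| [name.downcase, request_headers[name.downcase]] }
    entries = @entries[url]
    # Replace the entry for the same variant, so each URL holds at most
    # one entry per distinct combination of varying header values.
    entries.reject! { |entry| entry.vary_values == values }
    entries << Entry.new(values, body)
  end

  # Returns the cached body whose recorded header values match the
  # incoming request, or nil when there is none.
  def lookup(url, request_headers)
    return unless @entries.key?(url)

    entry = @entries[url].find do |candidate|
      candidate.vary_values.all? { |name, value| request_headers[name] == value }
    end
    entry&.body
  end
end
```

Replacing the entry of an already-seen variant is what keeps the per-URL list from growing without bound in this sketch.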
HoneyryderChuck
b0777c61e5 fix for loop on resolution and retry on new connection
A certain behaviour was observed, when performing some tests using the
hackernews script, where after a failed request on a non-initiated
connection, a new DNS resolution would be emitted, although the
connection still had other IPs to try on. This led to a cascading
behaviour where the DNS response would fill up the connection with the
same repeated IPs and trigger coalescing, which would loop indefinitely
after emitting the resolve event.

This was fixed by not allowing DNS resolutions of already resolved names
to propagate to connections which already contain the advertised IPs.

This seems to address GitHub issue 5, whose description matches the
observed behaviour.
2022-07-31 19:07:41 +01:00
HoneyryderChuck
32a81f2025 fix: response cache now also takes verb into account when caching
The previous strategy was working only with URLs. This strategy would
fall flat if the same url could be used with several HTTP verbs.
2022-07-31 17:03:30 +01:00
HoneyryderChuck
6c911768fe fixing selector timeout errors closing all connections and ignoring
resolvers

All kinds of errors happening during the select loop will be handled as
abrupt select loop errors, and terminate all connections; this also
includes timeout errors. This is not ideal, for two reasons: first,
connection timeout errors happening on the loop close all connections,
although it may be only triggered for one (or a subset of) connection
for which the timeout should trigger; second, errors on the DNS channel
propagate errors to connections indirectly (the emission mentioned
above), wrongly (connections for different hostnames not yet queried,
will also fail with timeout), and won't clean the resolver state (so
subsequent queries will be done for the same hostname which failed in
the first place).

This fix is a first step to solving this problem. It does not totally
address the first, but I'll fix dealing with errors from the second
use-case.
2022-06-22 02:09:26 +03:00
HoneyryderChuck
ecab7951c9 bugfix for unregistering connections timing out while resolving
A subtle bug slipped through the cracks, where if a resolve timeout
error happened, the connection would remain in the pool. Subsequent
requests to the same domain would activate it, although no requests
would go through it; the actual desired behaviour is to outright remove
it from the pool on such errors.

This was achieved by registering the "unregister_connection" callback
earlier. However, the connection accounting step would only trigger if
taking it out of the selector worked, meaning it had been registered
before.
2022-06-21 02:20:32 +03:00
HoneyryderChuck
675a2aa547 do not allow downgrading from https to http during altsvc handshake 2022-05-23 00:17:49 +01:00
HoneyryderChuck
c86f4be1a7 reworking auth APIs for a future 1.0 refactoring 2022-05-08 17:23:07 +01:00
HoneyryderChuck
be06032649 adding APIs for running other proxy auth schemes, and adapting internals to work with it 2022-05-07 18:20:07 +01:00
HoneyryderChuck
817a10a537 scoping http auth schemes out of its plugins, made them usable in proxy
In order to expose other auth schemes in proxy, the basic, digest and
ntlm modules were extracted from the plugins, these being left with the
request management. So now, an extra parameter, `:scheme`, can be
passed (it'll be "basic" for http and "socks5" for socks5 by default,
can also be "digest" or "ntlm", haven't tested those yet).
2022-05-07 13:57:10 +01:00
HoneyryderChuck
64f8ebcf51 exposing original hostname in errors when there's a dns error in a candidate name for resolution 2022-03-28 12:51:35 +01:00
HoneyryderChuck
53ee7ae225 native resolver: support resolv.conf search and ndots params 2022-03-20 02:25:39 +00:00
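For readers unfamiliar with the `search` and `ndots` settings, the conventional resolv.conf behaviour can be sketched as below. `candidate_names` is a hypothetical helper, and the ordering follows the usual resolver rules; the native resolver in this commit may differ in its details.

```ruby
# Hypothetical helper: expand a hostname into the list of names to try,
# following the usual resolv.conf `search` / `ndots` rules.
def candidate_names(name, search:, ndots: 1)
  # A trailing dot marks the name as absolute: the search list is skipped.
  return [name.chomp(".")] if name.end_with?(".")

  expanded = search.map { |domain| "#{name}.#{domain}" }
  # Names with at least `ndots` dots are tried as-is first;
  # shorter names go through the search list first.
  name.count(".") >= ndots ? [name, *expanded] : [*expanded, name]
end

candidate_names("db", search: ["svc.local", "local"])
# tries the search domains first, then the bare name
```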
HoneyryderChuck
c989a14435 native resolver fix: do not signal interests when there's nothing to do (was generating bursty IO) 2022-03-06 15:40:48 +00:00
HoneyryderChuck
6f7f7933c3 proc no lambda 2022-01-17 00:59:01 +02:00
HoneyryderChuck
037994514b reworked early_resolve to work with dual-stack
for multi-backed resolvers, resolving is attempted before sending it to
the resolver. in this way, cached, local or ip resolves get
propagated to the proper resolver by ip family, instead of the
previous mess.

the system resolver doesn't do these shenanigans (trust getaddrinfo)
2022-01-17 00:56:09 +02:00
HoneyryderChuck
554957f6ca initial reimplementation of the system resolver, now using getaddrinfo
the ruby `resolver` library does everything in ruby, and sequentially
(first ipv4 then ipv6 resolution). we already have native for that, and
getaddrinfo should be considered the ideal way to use DNS (potentially
in the future, it becomes the default resolver).
2022-01-16 22:54:56 +02:00
HoneyryderChuck
2940323412 implemented happy eyeballs v2 (rfc8305) for native and https resolver
Two resolvers are kept (IPv6/IPv4) along in the pool, to which all
names are sent to and read from in the same pool. IPv4 resolves are
subject to a 50ms delay (as per rfc) before they're used for connecting.
IPv6 addresses have preference, in that if they arrive before the delay,
they are immediately used. If they arrive after the delay, they do not
interrupt the connection, but they'll be the next-in-line in case
connection handshake fails.

Two resolvers are kept, but the inherent Connection will be shared,
thereby sending name resolving requests to the same HTTP/2 connection in
bulk. The resolution delay logic from above also applies.

Currently handles resolving via `resolv` lib. This happens synchronously
though, so we're not there yet.
2022-01-16 22:54:56 +02:00
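The preference rules described above can be expressed as a small pure function. This is only a sketch of the decision, with hypothetical names (`connection_plan`, `ipv6_at`, `ipv4_at`) and arrival times given in milliseconds; it does not model the actual resolver or connection objects.

```ruby
RESOLUTION_DELAY_MS = 50 # the delay applied to IPv4 answers

# Hypothetical sketch: given the arrival time (in ms) of each family's
# DNS answer (nil when no answer arrives), decide which family connects
# first, when, and which family is next in line if the handshake fails.
def connection_plan(ipv6_at:, ipv4_at:)
  raise ArgumentError, "no answers" if ipv6_at.nil? && ipv4_at.nil?
  return { first: :ipv6, start_at: ipv6_at, fallback: nil } if ipv4_at.nil?

  ipv4_ready_at = ipv4_at + RESOLUTION_DELAY_MS
  if ipv6_at && ipv6_at <= ipv4_ready_at
    # IPv6 arrived before the delay elapsed: it is used immediately.
    { first: :ipv6, start_at: ipv6_at, fallback: :ipv4 }
  else
    # IPv4 connects once the delay elapses; a late IPv6 answer does not
    # interrupt it and only becomes the next candidate.
    { first: :ipv4, start_at: ipv4_ready_at, fallback: ipv6_at ? :ipv6 : nil }
  end
end
```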
HoneyryderChuck
82b0a4bf28 fix: https resolver should close when no more outstanding connections are around 2022-01-16 22:54:56 +02:00
HoneyryderChuck
06b162b6ea applying a resolver manager to hold the different family type resolvers for the pool. This allows having multiple resolvers per type, i.e. IPv6 and IPv4 2022-01-16 22:54:56 +02:00
HoneyryderChuck
71920157f4 fix: ensuring that the https resolver is using the pool it's being created in 2022-01-16 22:54:56 +02:00
HoneyryderChuck
6d33b5e59f adding support for tempfile sigs, other improvements 2022-01-16 22:54:56 +02:00
HoneyryderChuck
9bd73e5a22 removing uncaching of resolved names (not used anywhere) 2022-01-16 22:54:54 +02:00
HoneyryderChuck
921d1f6371 removing resolver_ios cache from pool (did not have accurate values, not really useful since selector avoids double registries already) 2022-01-16 22:53:36 +02:00
HoneyryderChuck
a8830681df changing resolver structure to rely on inheritance, which helps with typing 2022-01-16 22:53:36 +02:00
HoneyryderChuck
15a4fb83ba options: also freeze inner unfrozen vars; fixed message of resolve
timeout to include host
2022-01-15 01:38:14 +02:00
HoneyryderChuck
7da23ac89c enable connection coalescing for proxied connections 2022-01-08 15:09:11 +02:00
HoneyryderChuck
d3b36c5668 added support for HTTP/2 proxy by simplifying the overall http proxy implementation 2022-01-07 12:26:26 +02:00
HoneyryderChuck
5a61586fd5 do not ignore goaway with no errors sent by the server, let the connection go down
raise a specific error for this, which will make it easier to rescue
from.
2021-11-18 18:07:12 +00:00
HoneyryderChuck
b18b715818 some small rbs improvements 2021-11-13 11:09:29 +00:00
HoneyryderChuck
a3ee98f410 implementation of the response cache plugin 2021-10-01 23:53:21 +01:00
HoneyryderChuck
bdcd4b31b5 aws_sdk plugin: removing S3 plugin APIs, replacing it with barebones aws-sdk-core components 2021-10-01 23:20:19 +01:00
HoneyryderChuck
efddd72caa removing persistent connections from the selector when inactive
keeping them around was resulting in some busy loops on timer events
(i.e. retry after), making them unreliable, inaccurate and CPU
draining. they're now kept out whenever they're inactive.
2021-09-27 16:18:02 +01:00
HoneyryderChuck
52948e0f83 bugfix: prevent stream close callback from being called 2 times
an issue was observed when a stream was closed from our side, where the
close callback would be called twice, affecting the request in-flight
count on the connection. This was fixed by not reacting to :stream_closed
events if the request has been previously deleted.
2021-09-23 12:12:48 +01:00
HoneyryderChuck
81a41d889c bugfix: remove connections from selector which have been unregistered
during interest calculation

A quirk was found whereby a connection which failed while connecting
(such as the badssl test) was properly unregistered from the pool, was
however kept in the selectables selector pool, because of this operation
happening during the interest calculation pool, and the var substitution
being performed right afterwards, leaving the pool and selector out of
sync and causing all sorts of miscalculations around timers later on.
2021-09-22 12:53:33 +01:00
HoneyryderChuck
13e865e488 hiding monotonic time funcs under the utils API 2021-09-20 16:16:20 +01:00
HoneyryderChuck
27d81f3090 introduce custom timer to replace Timers::Group
The HTTPX::Timers class mimics the same top-level API as its
predecessor, but simplifies its implementation. Adding a timer will
re-sort all timers, while lookups are roughly the same complexity. The
key difference is that callbacks are now aggregated by interval, i.e.
different requests setting the same timeout will reuse the same timer.
This is a simpler design than Timers::Group, which stores timers in
a binary search tree; the latter will perform well in any environment,
whereas the former is more tailored for the use-case of httpx, where
most of the time no timers will be set, and when they are, the same
timer will be reused for all requests because they usually have the same
set of options (and therefore timeouts).
2021-09-20 13:19:55 +01:00
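The "aggregated by interval" design can be illustrated with a small stand-alone class. `IntervalTimers` is a hypothetical name and a deliberately reduced sketch: it tracks elapsed time as a plain number passed in by the caller, which is an assumption made to keep the example self-contained, and it is not the `HTTPX::Timers` implementation.

```ruby
# Hypothetical sketch of timers whose callbacks are grouped by interval.
class IntervalTimers
  def initialize
    @intervals = {} # interval (seconds) => array of callbacks
  end

  # Registers a callback; callbacks with the same interval share one slot.
  def after(interval, &callback)
    (@intervals[interval] ||= []) << callback
    # Re-sort on every insert so the shortest interval always comes first.
    @intervals = @intervals.sort_by { |secs, _| secs }.to_h
    self
  end

  # Number of distinct intervals currently tracked.
  def size
    @intervals.size
  end

  # Runs the callbacks of every interval that has elapsed and drops them;
  # returns how many callbacks ran.
  def fire(elapsed)
    due, @intervals = @intervals.partition { |secs, _| secs <= elapsed }.map(&:to_h)
    count = 0
    due.each_value do |callbacks|
      callbacks.each do |callback|
        callback.call
        count += 1
      end
    end
    count
  end
end

timers = IntervalTimers.new
timers.after(5) { puts "request A timed out" }
timers.after(5) { puts "request B timed out" } # reuses the 5s slot
```

With most requests sharing the same timeout options, the hash stays tiny, which is why sorting on every insert is an acceptable cost in this design.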
HoneyryderChuck
f768cf7a0e Improving API compatibility and error checking in responses
* `Response#error`, which, coupled with `ErrorResponse#error`, allows
  for `if response.error` kind of conditional;
* `Response#raise_for_status` now returns the response when no error is
  raised (for method chaining);

Closes #153
2021-09-20 13:02:20 +01:00
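The two behaviours listed in this commit can be mimicked with stand-in classes. Everything below (`SketchResponse`, `SketchErrorResponse`, `SketchHTTPError`, and the rule that statuses of 400 and above count as errors) is an assumption made for illustration; the real httpx classes are not reproduced here.

```ruby
# Stand-in classes for illustration only, not the httpx implementation.
class SketchHTTPError < StandardError; end

class SketchResponse
  attr_reader :status

  def initialize(status)
    @status = status
  end

  # nil for successful responses, an exception object otherwise
  # (assumption: 4xx and 5xx statuses are treated as errors).
  def error
    SketchHTTPError.new("HTTP Error: #{status}") if status >= 400
  end

  # Returns self when there is no error, which allows method chaining.
  def raise_for_status
    err = error
    raise err if err

    self
  end
end

class SketchErrorResponse
  attr_reader :error

  def initialize(error)
    @error = error
  end

  def raise_for_status
    raise @error
  end
end

# Both kinds of response answer #error, so one conditional covers them.
[SketchResponse.new(200), SketchErrorResponse.new(IOError.new("closed"))].each do |response|
  puts(response.error ? "failed: #{response.error.message}" : "ok")
end
```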
HoneyryderChuck
6e6c7848cc making jitter a retry plugin option 2021-09-20 12:37:48 +01:00
HoneyryderChuck
cdcbf14675 few sig changes (more assertive) 2021-08-31 13:50:29 +01:00
HoneyryderChuck
2a5c429dbd improvement in Session#request, one of the main APIs 2021-08-10 11:10:45 +01:00
HoneyryderChuck
8ded86cec6 improving sigs for new components 2021-08-10 11:10:45 +01:00
HoneyryderChuck
e1ee8c69dc proxy: fixing proxy resolve error filtering to also work with system resolver 2021-08-10 11:10:45 +01:00
HoneyryderChuck
6b61b8ccdb fixing signatures
also adding some checks on code, in order for steep to stop complaining
about potential nil returns.
2021-08-10 10:28:58 +01:00
HoneyryderChuck
f2d3c1f09b added Response#form (supports only x-www-urlencoded for now) 2021-08-09 15:54:25 +01:00
HoneyryderChuck
a85828d0d5 added Response#json 2021-08-09 15:54:24 +01:00
HoneyryderChuck
29105854e3 do not pass around options after requests get initialized, instead rely on request options 2021-08-05 15:03:49 +01:00
HoneyryderChuck
37f23ad8c3 cookie: force name to string; as for the value, only late-force it on writing the cookie 2021-07-16 09:52:21 +01:00
HoneyryderChuck
3ae18120d2 when users pass a Cookie header as option and plugin is enabled, the header will be parsed and managed by the jar 2021-07-16 09:52:21 +01:00
HoneyryderChuck
174f5f9647 allow cookie jar to merge 2021-07-15 15:53:15 +01:00