114 Commits

Author SHA1 Message Date
Sam Balana 82a5cada59 Add AfterFunc to tcpip.Clock
Changes the API of tcpip.Clock to also provide a method for scheduling and
rescheduling work after a specified duration. This change also implements the
AfterFunc method for existing implementations of tcpip.Clock.

This is the groundwork required to mock time within tests. All references to
CancellableTimer have been replaced with the tcpip.Job interface, allowing for
custom implementations of scheduling work.

This is a BREAKING CHANGE for clients that implement their own tcpip.Clock or
use tcpip.CancellableTimer. Migration plan:
 1. Add AfterFunc(d, f) to tcpip.Clock
 2. Replace references of tcpip.CancellableTimer with tcpip.Job
 3. Replace calls to tcpip.CancellableTimer#StopLocked with tcpip.Job#Cancel
 4. Replace calls to tcpip.CancellableTimer#Reset with tcpip.Job#Schedule
 5. Replace calls to tcpip.NewCancellableTimer with tcpip.NewJob.

PiperOrigin-RevId: 322906897
2020-07-23 18:00:43 -07:00
Bhasker Hariharan fef90c61c6 Fix minor bugs in a couple of interface IOCTLs.
gVisor returns the wrong ARP type for SIOCGIFHWADDR. This breaks
tcpdump, which then interprets the packets incorrectly.

Similarly, SIOCETHTOOL is used by tcpdump to query interface properties which
fails with an EINVAL since we don't implement it. For now change it to return
EOPNOTSUPP to indicate that we don't support the query rather than return
EINVAL.

NOTE: ARPHRD types for link endpoints are distinct from NIC capabilities
and NIC flags. In Linux all three exist: ARPHRD types are stored in the
dev->type field; NIC capabilities are more like device features, which can be
queried using SIOCETHTOOL but not modified; and NIC flags are fields that can
be modified from user space, e.g. NIC status (UP/DOWN/MULTICAST/BROADCAST).

Updates #2746

PiperOrigin-RevId: 321436525
2020-07-15 14:15:44 -07:00
Kevin Krakauer 43c209f48e garbage collect connections
As in Linux, we must periodically clean up unused connections.

PiperOrigin-RevId: 321003353
2020-07-13 12:00:46 -07:00
Bhasker Hariharan b070e218c6 Add support for Stack level options.
Linux controls socket send/receive buffers using a few sysctl variables
  - net.core.rmem_default
  - net.core.rmem_max
  - net.core.wmem_max
  - net.core.wmem_default
  - net.ipv4.tcp_rmem
  - net.ipv4.tcp_wmem

The first 4 control the default socket buffer sizes for all sockets
raw/packet/tcp/udp and also the maximum permitted socket buffer that can be
specified in setsockopt(SOL_SOCKET, SO_(RCV|SND)BUF,...).

The last two control the TCP auto-tuning limits and override the default
specified in rmem_default/wmem_default as well as the max limits.

Netstack today only implements tcp_rmem/tcp_wmem and incorrectly uses them
to limit the maximum size in setsockopt(), and also applies them to raw/udp
sockets.

This changelist introduces the other 4 and updates the udp/raw sockets to use
the newly introduced variables. The min/max values match the current
tcp_rmem/wmem values, and the default buffer size for UDP/RAW sockets is
raised to the Linux value of 212 KiB, up from the very low current value
of 32 KiB.
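
How a stack-wide min/default/max triple bounds a setsockopt() request can be
sketched as below. This is an illustration only: bufferSizeRange and clamp
are hypothetical names, the Min and Max values are placeholders, and only
the 212 KiB default comes from the description above.

```go
package main

import "fmt"

// bufferSizeRange is a hypothetical stack-level option holding the
// permitted socket buffer sizes in bytes.
type bufferSizeRange struct {
	Min     int
	Default int
	Max     int
}

// clamp bounds a size requested via setsockopt to [Min, Max].
func (r bufferSizeRange) clamp(requested int) int {
	if requested < r.Min {
		return r.Min
	}
	if requested > r.Max {
		return r.Max
	}
	return requested
}

func main() {
	// Min and Max are illustrative assumptions, not values from the change.
	rcv := bufferSizeRange{Min: 4 << 10, Default: 212 << 10, Max: 4 << 20}

	fmt.Println(rcv.Default) // size a new UDP/raw socket would start with
	fmt.Println(rcv.clamp(1), rcv.clamp(64<<10), rcv.clamp(1<<30))
	// prints 4096 65536 4194304
}
```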

Updates #3043
Fixes #3043

PiperOrigin-RevId: 318089805
2020-06-24 10:24:20 -07:00
Ian Gudger 2141013dce Add support for SO_REUSEADDR to TCP sockets/endpoints.
For TCP sockets, SO_REUSEADDR relaxes the rules for binding addresses.

gVisor/netstack already supported a behavior similar to SO_REUSEADDR, but did
not allow disabling it. This change brings the SO_REUSEADDR behavior closer to
the behavior implemented by Linux and adds a new SO_REUSEADDR disabled
behavior. Like Linux, SO_REUSEADDR is now disabled by default.

PiperOrigin-RevId: 317984380
2020-06-23 19:15:38 -07:00
Ghanan Gowripalan 09b2fca40c Cleanup tcp.timer and tcpip.Route
When a tcp.timer or tcpip.Route is no longer used, clean up its
resources so that unused memory may be released.

PiperOrigin-RevId: 317046582
2020-06-18 00:10:05 -07:00
Ian Gudger a085e562d0 Add support for SO_REUSEADDR to UDP sockets/endpoints.
On UDP sockets, SO_REUSEADDR allows multiple sockets to bind to the same
address, but only delivers packets to the most recently bound socket. This
differs from the behavior of SO_REUSEADDR on TCP sockets. SO_REUSEADDR for TCP
sockets will likely need an almost completely independent implementation.
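
The UDP delivery rule described above can be sketched roughly as follows.
This is a simplified model, not netstack code: the socket and demuxer types
are hypothetical, addresses are plain strings, and the interactions with
SO_REUSEPORT are ignored.

```go
package main

import "fmt"

// socket is a hypothetical UDP socket with an in-memory receive queue.
type socket struct {
	name      string
	reuseAddr bool
	inbox     []string
}

// demuxer maps a bound address to its sockets, in bind order.
type demuxer struct {
	bound map[string][]*socket
}

// bind succeeds on a shared address only if the new socket and every socket
// already bound there have SO_REUSEADDR set.
func (d *demuxer) bind(addr string, s *socket) error {
	for _, other := range d.bound[addr] {
		if !s.reuseAddr || !other.reuseAddr {
			return fmt.Errorf("address %s already in use", addr)
		}
	}
	d.bound[addr] = append(d.bound[addr], s)
	return nil
}

// deliver hands the payload to the most recently bound socket only.
func (d *demuxer) deliver(addr, payload string) bool {
	socks := d.bound[addr]
	if len(socks) == 0 {
		return false
	}
	last := socks[len(socks)-1]
	last.inbox = append(last.inbox, payload)
	return true
}

func main() {
	d := &demuxer{bound: map[string][]*socket{}}
	first := &socket{name: "first", reuseAddr: true}
	second := &socket{name: "second", reuseAddr: true}
	d.bind("0.0.0.0:5000", first)
	d.bind("0.0.0.0:5000", second)

	d.deliver("0.0.0.0:5000", "hello")
	fmt.Println(len(first.inbox), len(second.inbox)) // prints 0 1
}
```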

SO_REUSEADDR has some odd interactions with the similar SO_REUSEPORT. These
interactions are tested fairly extensively and all but one particularly odd
one (that honestly seems like a bug) behave the same on gVisor and Linux.

PiperOrigin-RevId: 315844832
2020-06-10 23:49:26 -07:00
Ghanan Gowripalan 2d3b9d18e7 Handle removed NIC in NDP timer for packet tx
NDP packets are sent periodically from NDP timers. These timers do not
hold the NIC lock when sending packets as the packet write operation
may take some time. While the lock is not held, the NIC may be removed
by some other goroutine. This change handles that scenario gracefully.

Test: stack_test.TestRemoveNICWhileHandlingRSTimer
PiperOrigin-RevId: 315524143
2020-06-09 11:33:20 -07:00
Ting-Yu Wang 41da7a568b Fix copylocks error about copying IPTables.
IPTables.connections contains a sync.RWMutex. Copying it will trigger copylocks
analysis. Tested by manually enabling nogo tests.

sync.RWMutex is added to IPTables for the additional race condition discovered.

PiperOrigin-RevId: 314817019
2020-06-05 11:29:09 -07:00
Ting-Yu Wang d3a8bffe04 Pass PacketBuffer as pointer.
Historically we've been passing PacketBuffer by shallow copying throughout
the stack. Right now, this is only correct because the caller does not use
PacketBuffer after passing it into the next layer in netstack.

With the new buffer management effort in gVisor/netstack, PacketBuffer will
own a Buffer (to be added). Internally, both PacketBuffer and Buffer may
have pointers and shallow copying shouldn't be used.
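
The hazard with shallow copies can be illustrated with a small sketch. The
packetBuffer type below is a hypothetical stand-in with two fields, not the
real PacketBuffer; it only shows how a by-value copy shares slice contents
while silently dropping changes to plain fields.

```go
package main

import "fmt"

// packetBuffer is a hypothetical stand-in for a struct that mixes plain
// fields with reference-like fields such as slices.
type packetBuffer struct {
	header []byte
	hash   uint32
}

// mutateByValue receives a shallow copy: the slice still aliases the
// caller's bytes, but the hash field belongs to the copy.
func mutateByValue(pkt packetBuffer) {
	pkt.hash = 42
	pkt.header[0] = 0xff
}

// mutateByPointer operates on the caller's struct directly.
func mutateByPointer(pkt *packetBuffer) {
	pkt.hash = 42
}

func main() {
	pkt := packetBuffer{header: []byte{0x45, 0x00}}

	mutateByValue(pkt)
	// The header change leaked through the copy; the hash change was lost.
	fmt.Println(pkt.hash, pkt.header[0]) // prints 0 255

	mutateByPointer(&pkt)
	fmt.Println(pkt.hash) // prints 42
}
```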

Updates #2404.

PiperOrigin-RevId: 314610879
2020-06-03 15:00:42 -07:00
Adin Scannell 420b791a3d Minor formatting updates for gvisor.dev.
* Aggregate architecture Overview in "What is gVisor?" as it makes more sense
  in one place.

* Drop "user-space kernel" and use "application kernel". The term "user-space
  kernel" is confusing when some platform implementations do not run in
  user-space (instead running in guest ring zero).

* Clear up the relationship between the Platform page in the user guide and the
  Platform page in the architecture guide, and ensure they are cross-linked.

* Restore the call-to-action quick start link in the main page, and drop the
  GitHub link (which also appears in the top-right).

* Improve image formatting by centering all doc and blog images, and move the
  image captions to the alt text.

PiperOrigin-RevId: 311845158
2020-05-15 20:05:18 -07:00
gVisor bot cfd30665c1 iptables - filter packets using outgoing interface.
Enables commands with -o (--out-interface) for iptables rules.
$ iptables -A OUTPUT -o eth0 -j ACCEPT

PiperOrigin-RevId: 310642286
2020-05-08 15:44:54 -07:00
Nayana Bidari b660f16d18 Support for connection tracking of TCP packets.
Connection tracking is used to track packets in prerouting and
output hooks of iptables. The NAT rules modify the tuples in
connections. The connection tracking code modifies the packets by
looking at the modified tuples.
2020-05-01 16:59:40 -07:00
Ghanan Gowripalan 37a59bc76d Support IPv6 Privacy Extensions for SLAAC
Support generating temporary (short-lived) IPv6 SLAAC addresses to
address privacy concerns outlined in RFC 4941.

Tests:
- stack_test.TestAutoGenTempAddr
- stack_test.TestNoAutoGenTempAddrForLinkLocal
- stack_test.TestAutoGenTempAddrRegen
- stack_test.TestAutoGenTempAddrRegenTimerUpdates
- stack_test.TestNoAutoGenTempAddrWithoutStableAddr
- stack_test.TestAutoGenAddrInResponseToDADConflicts
PiperOrigin-RevId: 308915566
2020-04-28 16:02:44 -07:00
Bhasker Hariharan c8eeedcc1d Add support for setting TCP segment hash.
This allows the link layer endpoints to consistently hash a TCP
segment to a single underlying queue when a link layer endpoint
supports multiple underlying queues.
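
One way a per-segment hash can pin a flow to a queue is sketched below. This
is an assumption-laden illustration: the flow type, the choice of FNV-1a,
and the queueFor helper are all hypothetical and say nothing about the hash
netstack actually uses.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"hash/fnv"
)

// flow is a hypothetical IPv4 TCP 4-tuple.
type flow struct {
	srcAddr, dstAddr [4]byte
	srcPort, dstPort uint16
}

// hash returns a value that is stable for every segment of the same flow.
// FNV-1a is used here purely for illustration.
func (f flow) hash() uint32 {
	h := fnv.New32a()
	h.Write(f.srcAddr[:])
	h.Write(f.dstAddr[:])
	var ports [4]byte
	binary.BigEndian.PutUint16(ports[0:2], f.srcPort)
	binary.BigEndian.PutUint16(ports[2:4], f.dstPort)
	h.Write(ports[:])
	return h.Sum32()
}

// queueFor maps a segment hash onto one of numQueues underlying queues.
// numQueues must be positive.
func queueFor(hash uint32, numQueues int) int {
	return int(hash % uint32(numQueues))
}

func main() {
	f := flow{
		srcAddr: [4]byte{10, 0, 0, 1}, dstAddr: [4]byte{10, 0, 0, 2},
		srcPort: 43210, dstPort: 80,
	}
	// Every segment of the flow carries the same hash, so it always lands
	// on the same queue and stays ordered relative to its peers.
	fmt.Println(queueFor(f.hash(), 4) == queueFor(f.hash(), 4)) // prints true
}
```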

Updates #231

PiperOrigin-RevId: 302760664
2020-03-24 15:34:43 -07:00
Bhasker Hariharan 7e4073af12 Move tcpip.PacketBuffer and IPTables to stack package.
This is a precursor to being able to build an intrusive list
of PacketBuffers for use in queuing disciplines being implemented.

Updates #2214

PiperOrigin-RevId: 302677662
2020-03-24 09:06:26 -07:00
gVisor bot d6440ec5a1 The packet forwarding should resolve the link address if necessary.
Fixes #1510

Test:
- stack_test.TestForwardingWithStaticResolver
- stack_test.TestForwardingWithFakeResolver
- stack_test.TestForwardingWithNoResolver
- stack_test.TestForwardingWithFakeResolverPartialTimeout
- stack_test.TestForwardingWithFakeResolverTwoPackets
- stack_test.TestForwardingWithFakeResolverManyPackets
- stack_test.TestForwardingWithFakeResolverManyResolutions
PiperOrigin-RevId: 300182570
2020-03-10 14:50:13 -07:00
Ian Gudger c15b8515eb Fix datarace on TransportEndpointInfo.ID and clean up semantics.
Ensures that all access to TransportEndpointInfo.ID is either:
* In a function ending in a Locked suffix.
* While holding the appropriate mutex.

This primarily affects the checkV4Mapped method on affected endpoints, which has
been renamed to checkV4MappedLocked. Also document the method and change its
argument to be a value instead of a pointer, which had caused some awkwardness.

This race was possible in the udp and icmp endpoints between Connect and uses
of TransportEndpointInfo.ID, including in Connect itself and in Bind.

The tcp endpoint did not suffer from this bug, but benefited from better
documentation.

Updates #357

PiperOrigin-RevId: 298682913
2020-03-03 13:42:13 -08:00
Ian Gudger c37b196455 Add support for tearing down protocol dispatchers and TIME_WAIT endpoints.
Protocol dispatchers were previously leaked. Bypassing TIME_WAIT is required to
test this change.

Also fix a race when a socket in SYN-RCVD is closed. This is also required to
test this change.

PiperOrigin-RevId: 296922548
2020-02-24 10:32:17 -08:00
Ting-Yu Wang b8f56c79be Implement tap/tun device in vfs.
PiperOrigin-RevId: 296526279
2020-02-21 15:42:56 -08:00
Ghanan Gowripalan a155a23480 Attach LinkEndpoint to NetworkDispatcher immediately
Tests: stack_test.TestAttachToLinkEndpointImmediately
PiperOrigin-RevId: 296474068
2020-02-21 11:21:23 -08:00
gVisor bot 67b615b86f Support disabling a NIC
- Disabled NICs will have their associated NDP state cleared.
- Disabled NICs will not accept incoming packets.
- Writes through a Route with a disabled NIC will return an invalid
  endpoint state error.
- stack.Stack.FindRoute will not return a route with a disabled NIC.
- NIC's Running flag will report the NIC's enabled status.

Tests:
- stack_test.TestDisableUnknownNIC
- stack_test.TestDisabledNICsNICInfoAndCheckNIC
- stack_test.TestRoutesWithDisabledNIC
- stack_test.TestRouteWritePacketWithDisabledNIC
- stack_test.TestStopStartSolicitingRouters
- stack_test.TestCleanupNDPState
- stack_test.TestAddRemoveIPv4BroadcastAddressOnNICEnableDisable
- stack_test.TestJoinLeaveAllNodesMulticastOnNICEnableDisable
PiperOrigin-RevId: 296298588
2020-02-20 14:32:49 -08:00
gVisor bot b8e22e241c Disallow duplicate NIC names.
PiperOrigin-RevId: 294500858
2020-02-11 12:59:11 -08:00
Ting-Yu Wang 665b614e4a Support RTM_NEWADDR and RTM_GETLINK in (rt)netlink.
PiperOrigin-RevId: 293271055
2020-02-04 18:05:03 -08:00
gVisor bot 5f82f092e7 Merge pull request #1558 from kevinGC:iptables-write-input-drop
PiperOrigin-RevId: 290793754
2020-01-21 12:08:52 -08:00