Commit Graph

324 Commits

Author SHA1 Message Date
Ayush Ranjan e2c70ee981 Enable automated marshalling for netstack.
PiperOrigin-RevId: 322954792
2020-07-24 01:25:39 -07:00
gVisor bot 5e7ae04766 Merge pull request #3142 from tanjianfeng:fix-3141
PiperOrigin-RevId: 322937495
2020-07-23 22:25:14 -07:00
Ayush Ranjan 6f7f739967 Marshallable socket options.
Socket option values are now required to implement marshal.Marshallable.

Co-authored-by: Rahat Mahmood <rahat@google.com>
PiperOrigin-RevId: 322831612
2020-07-23 11:45:10 -07:00
Kevin Krakauer bd98f82014 iptables: replace maps with arrays
For iptables users, Check() is a hot path called for every packet one or more
times. Let's avoid a bunch of map lookups.

PiperOrigin-RevId: 322678699
2020-07-22 16:23:55 -07:00
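
The idea behind bd98f82014 (replacing map lookups on a hot path with array indexing) can be sketched as follows. This is a minimal illustration, not gVisor's code: the `Hook` constants, the `Table` type, and the `entrypoint` method are hypothetical names chosen for the example.

```go
package main

import "fmt"

// Hook identifies a point in packet processing. These names are
// hypothetical stand-ins, not necessarily gVisor's identifiers.
type Hook int

const (
	Prerouting Hook = iota
	Input
	Forward
	Output
	Postrouting
	numHooks // number of hooks; used as the array length
)

// Table records, for each hook, the index of the first rule to evaluate.
// An array indexed by Hook takes the place of a map[Hook]int, so a lookup
// is a bounds-checked index instead of a hash computation.
type Table struct {
	entrypoints [numHooks]int
}

// entrypoint returns the rule index at which evaluation starts for h.
func (t *Table) entrypoint(h Hook) int {
	return t.entrypoints[h]
}

func main() {
	var t Table
	t.entrypoints[Input] = 3
	fmt.Println(t.entrypoint(Input))  // 3
	fmt.Println(t.entrypoint(Output)) // 0 (zero value, never set)
}
```

The trade-off is that the key space must be small and dense; a sparse or open-ended key set would still call for a map.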
Bhasker Hariharan 71bf90c55b Support for receiving outbound packets in AF_PACKET.
Updates #173

PiperOrigin-RevId: 322665518
2020-07-22 15:33:33 -07:00
Bhasker Hariharan dcf6ddc277 Add support to return protocol in recvmsg for AF_PACKET.
Updates #173

PiperOrigin-RevId: 321690756
2020-07-16 18:40:32 -07:00
Bhasker Hariharan fef90c61c6 Fix minor bugs in a couple of interface IOCTLs.
gVisor returns the wrong ARP type for SIOCGIFHWADDR. This breaks tcpdump,
which then interprets the packets incorrectly.

Similarly, tcpdump uses SIOCETHTOOL to query interface properties, which
fails with EINVAL since we don't implement it. For now, return EOPNOTSUPP
instead to indicate that we don't support the query.

NOTE: ARPHRD types for link endpoints are distinct from NIC capabilities
and NIC flags. In Linux all 3 exist: ARPHRD types are stored in the dev->type
field; NIC capabilities are more like device features, which can be queried
using SIOCETHTOOL but not modified; and NIC flags are fields that can be
modified from user space, e.g. NIC status (UP/DOWN/MULTICAST/BROADCAST).

Updates #2746

PiperOrigin-RevId: 321436525
2020-07-15 14:15:44 -07:00
Tiwei Bie 505bebae43 hostinet: fix fd leak in fdnotifier for VFS2
When we fail to create the new socket after adding the fd to
fdnotifier, we should remove the fd from fdnotifier, because we
are going to close the fd directly.

Fixes: #3241

Signed-off-by: Tiwei Bie <tiwei.btw@antgroup.com>
2020-07-15 22:20:02 +08:00
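
The cleanup pattern described in 505bebae43 (undo the notifier registration when a later step fails) might look roughly like this. The `notifier` type, `registerSocket`, and `errCreate` are hypothetical names for illustration; the real fdnotifier API is not shown in the commit message.

```go
package main

import (
	"errors"
	"fmt"
)

// errCreate is a hypothetical error standing in for a socket creation failure.
var errCreate = errors.New("socket creation failed")

// notifier is a toy stand-in for an fd notifier: it only tracks which
// file descriptors are currently registered.
type notifier struct {
	fds map[int]struct{}
}

func newNotifier() *notifier { return &notifier{fds: make(map[int]struct{})} }

func (n *notifier) add(fd int)    { n.fds[fd] = struct{}{} }
func (n *notifier) remove(fd int) { delete(n.fds, fd) }

func (n *notifier) has(fd int) bool {
	_, ok := n.fds[fd]
	return ok
}

// registerSocket registers fd and then runs create. If create fails, the
// registration is removed so that the caller can close fd without leaving
// a stale entry behind.
func registerSocket(n *notifier, fd int, create func(fd int) error) error {
	n.add(fd)
	if err := create(fd); err != nil {
		n.remove(fd)
		return err
	}
	return nil
}

func main() {
	n := newNotifier()
	err := registerSocket(n, 7, func(int) error { return errCreate })
	fmt.Println(err, n.has(7))
}
```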
Bhasker Hariharan 216dcebc06 Stub out SO_DETACH_FILTER.
Updates #2746

PiperOrigin-RevId: 320757963
2020-07-11 06:22:47 -07:00
gVisor bot 5df3a8fede Discard multicast UDP source address.
RFC-1122 (and others) specify that UDP should not receive
datagrams that have a source address that is a multicast address.
Packets should never be received FROM a multicast address.
See also, RFC 768:  'User Datagram Protocol'
J. Postel, ISI, 28 August 1980
  A UDP datagram received with an invalid IP source address
    (e.g., a broadcast or multicast address) must be discarded
    by UDP or by the IP layer (see RFC 1122 Section 3.2.1.3).
This CL does not address TCP or broadcast, which are more complicated.

Also adds a test for both IPv6 and IPv4 UDP.

Fixes #3154

PiperOrigin-RevId: 320547674
2020-07-09 22:35:42 -07:00
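
A minimal sketch of the check described in 5df3a8fede, using only Go's standard `net` package. The helper name `shouldDropUDP` is hypothetical, and textual addresses are used purely for readability; a real stack would inspect the binary source address in the IP header. Like the commit, the sketch does not handle broadcast sources.

```go
package main

import (
	"fmt"
	"net"
)

// shouldDropUDP reports whether a datagram with the given source address
// should be discarded. A multicast source is never valid; an unparsable
// address is also treated as invalid in this sketch.
func shouldDropUDP(src string) bool {
	ip := net.ParseIP(src)
	if ip == nil {
		return true
	}
	return ip.IsMulticast()
}

func main() {
	for _, src := range []string{"192.0.2.1", "224.0.0.1", "2001:db8::1", "ff02::1"} {
		fmt.Printf("%-12s drop=%v\n", src, shouldDropUDP(src))
	}
}
```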
Bhasker Hariharan 5946f11182 Add support for IP_HDRINCL IP option for raw sockets.
Updates #2746
Fixes #3158

PiperOrigin-RevId: 320497190
2020-07-09 16:25:57 -07:00
Jianfeng Tan 057a2666fa hostinet: fix not specifying family field
Creating sockets by hostinet with VFS2 fails due to triggering a
seccomp violation. In essence, we fail to pass down the family
field.

We fix this by passing down the family field.

Fixes #3141

Signed-off-by: Jianfeng Tan <henry.tjf@antfin.com>
2020-07-06 20:53:51 +08:00
Zach Koopmans 6a90c88b97 Port fallocate to VFS2.
PiperOrigin-RevId: 319283715
2020-07-01 13:14:44 -07:00
gVisor bot 8dbeac53ce Implement SO_NO_CHECK socket option.
SO_NO_CHECK is used to skip the UDP checksum generation on a TX socket
(UDP checksum is optional on IPv4).

Test:
 - TestNoChecksum
 - SoNoCheckOffByDefault (UdpSocketTest)
 - SoNoCheck (UdpSocketTest)

Fixes #3055

PiperOrigin-RevId: 318575215
2020-06-26 17:51:04 -07:00
Kevin Krakauer 7fb6cc286f conntrack refactor, no behavior changes
- Split connTrackForPacket into 2 functions instead of switching on flag
- Replace hash with struct keys.
- Remove prefixes where possible
- Remove unused connStatus, timeout
- Flatten ConnTrack struct a bit - some intermediate structs had no meaning
  outside of the context of their parent.
- Protect conn.tcb with a mutex
- Remove redundant error checking (e.g. when is pkt.NetworkHeader valid)
- Clarify that HandlePacket and CreateConnFor are the expected entrypoints for
  ConnTrack

PiperOrigin-RevId: 318407168
2020-06-25 21:21:57 -07:00
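
One item in 7fb6cc286f, "Replace hash with struct keys", relies on the fact that a Go struct whose fields are all comparable can be used directly as a map key. The sketch below illustrates that technique only; `tupleID`, `connTable`, and their fields are hypothetical and not taken from the gVisor source.

```go
package main

import "fmt"

// tupleID identifies one direction of a connection. Every field is
// comparable, so the struct itself can serve as a map key and no
// hand-rolled hash is needed.
type tupleID struct {
	srcAddr [4]byte
	srcPort uint16
	dstAddr [4]byte
	dstPort uint16
	proto   uint8
}

// reply returns the key for the opposite direction of the same connection.
func (t tupleID) reply() tupleID {
	return tupleID{
		srcAddr: t.dstAddr, srcPort: t.dstPort,
		dstAddr: t.srcAddr, dstPort: t.srcPort,
		proto: t.proto,
	}
}

type conn struct {
	original tupleID
}

type connTable struct {
	conns map[tupleID]*conn
}

func newConnTable() *connTable {
	return &connTable{conns: make(map[tupleID]*conn)}
}

// insert stores the connection under both its original and reply keys.
func (ct *connTable) insert(id tupleID) *conn {
	c := &conn{original: id}
	ct.conns[id] = c
	ct.conns[id.reply()] = c
	return c
}

func (ct *connTable) lookup(id tupleID) (*conn, bool) {
	c, ok := ct.conns[id]
	return c, ok
}

func main() {
	ct := newConnTable()
	id := tupleID{
		srcAddr: [4]byte{10, 0, 0, 1}, srcPort: 40000,
		dstAddr: [4]byte{10, 0, 0, 2}, dstPort: 80,
		proto: 6, // TCP
	}
	c := ct.insert(id)
	got, ok := ct.lookup(id.reply())
	fmt.Println(ok, got == c)
}
```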
Bhasker Hariharan b070e218c6 Add support for Stack level options.
Linux controls socket send/receive buffers using a few sysctl variables:
  - net.core.rmem_default
  - net.core.rmem_max
  - net.core.wmem_max
  - net.core.wmem_default
  - net.ipv4.tcp_rmem
  - net.ipv4.tcp_wmem

The first 4 control the default socket buffer sizes for all sockets
(raw/packet/tcp/udp) and also the maximum permitted socket buffer size that
can be specified in setsockopt(SOL_SOCKET, SO_(RCV|SND)BUF, ...).

The last two control the TCP auto-tuning limits and override the default
specified in rmem_default/wmem_default as well as the max limits.

Netstack today implements only tcp_rmem/tcp_wmem, and incorrectly uses them
both to limit the maximum size in setsockopt() and to size raw/udp sockets.

This changelist introduces the other 4 and updates the udp/raw sockets to use
the newly introduced variables. The min/max values match the current
tcp_rmem/wmem values, and the default buffer size for UDP/RAW sockets is
raised to the Linux value of 212 KiB, up from the very low current value
of 32 KiB.

Updates #3043
Fixes #3043

PiperOrigin-RevId: 318089805
2020-06-24 10:24:20 -07:00
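
The min/default/max relationship described in b070e218c6 can be sketched as a simple clamp. The `bufferSizeOption` type and its `clamp` method are hypothetical; the 212 KiB default comes from the commit message, while the 4 KiB minimum and 4 MiB maximum are placeholder values assumed for the example. The sketch also ignores details such as Linux doubling the requested value.

```go
package main

import "fmt"

// bufferSizeOption holds stack-wide limits for a socket buffer, in bytes.
type bufferSizeOption struct {
	Min     int
	Default int
	Max     int
}

// clamp returns the buffer size to apply for a setsockopt-style request,
// limited to the range [Min, Max].
func (o bufferSizeOption) clamp(requested int) int {
	if requested < o.Min {
		return o.Min
	}
	if requested > o.Max {
		return o.Max
	}
	return requested
}

func main() {
	opt := bufferSizeOption{
		Min:     4 << 10,   // assumed placeholder
		Default: 212 << 10, // 212 KiB, per the commit message
		Max:     4 << 20,   // assumed placeholder
	}
	fmt.Println(opt.Default)
	fmt.Println(opt.clamp(1))
	fmt.Println(opt.clamp(64 << 10))
	fmt.Println(opt.clamp(64 << 20))
}
```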
gVisor bot d962f9f384 Implement UDP checksum verification.
Test:
 - TestIncrementChecksumErrors

Fixes #2943

PiperOrigin-RevId: 317348158
2020-06-19 11:43:20 -07:00
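
For context on d962f9f384, UDP checksum verification over IPv4 can be sketched as below: the ones' complement sum of the pseudo-header and the whole UDP segment, including the transmitted checksum, must come out as 0xffff. A zero checksum field means the sender computed no checksum, which IPv4 permits. All function names are hypothetical, and the code follows RFC 768 rather than gVisor's implementation.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// onesComplementSum adds b to sum as big-endian 16-bit words, padding a
// trailing odd byte with zero, and folds carries back into 16 bits.
func onesComplementSum(sum uint32, b []byte) uint32 {
	for len(b) >= 2 {
		sum += uint32(binary.BigEndian.Uint16(b))
		b = b[2:]
	}
	if len(b) == 1 {
		sum += uint32(b[0]) << 8
	}
	for sum>>16 != 0 {
		sum = sum&0xffff + sum>>16
	}
	return sum
}

// pseudoHeaderIPv4 builds the 12-byte IPv4 pseudo-header used by UDP.
func pseudoHeaderIPv4(src, dst [4]byte, udpLen uint16) []byte {
	ph := make([]byte, 12)
	copy(ph[0:4], src[:])
	copy(ph[4:8], dst[:])
	ph[9] = 17 // IP protocol number for UDP
	binary.BigEndian.PutUint16(ph[10:12], udpLen)
	return ph
}

// setUDPChecksum computes the checksum of segment (UDP header plus
// payload) and stores it in the header's checksum field.
func setUDPChecksum(src, dst [4]byte, segment []byte) {
	segment[6], segment[7] = 0, 0
	sum := onesComplementSum(0, pseudoHeaderIPv4(src, dst, uint16(len(segment))))
	sum = onesComplementSum(sum, segment)
	csum := ^uint16(sum)
	if csum == 0 {
		csum = 0xffff // zero is reserved for "no checksum"
	}
	binary.BigEndian.PutUint16(segment[6:8], csum)
}

// udpChecksumValid reports whether segment carries a correct checksum.
func udpChecksumValid(src, dst [4]byte, segment []byte) bool {
	if len(segment) < 8 {
		return false // shorter than a UDP header
	}
	if binary.BigEndian.Uint16(segment[6:8]) == 0 {
		return true // sender did not compute a checksum (IPv4 only)
	}
	sum := onesComplementSum(0, pseudoHeaderIPv4(src, dst, uint16(len(segment))))
	sum = onesComplementSum(sum, segment)
	return sum == 0xffff
}

func main() {
	src, dst := [4]byte{192, 0, 2, 1}, [4]byte{192, 0, 2, 2}
	// Ports 1234 -> 5678, length 13, checksum unset, payload "hello".
	seg := []byte{0x04, 0xd2, 0x16, 0x2e, 0x00, 0x0d, 0, 0, 'h', 'e', 'l', 'l', 'o'}
	setUDPChecksum(src, dst, seg)
	fmt.Println(udpChecksumValid(src, dst, seg))
	seg[8] ^= 0x01 // corrupt one payload bit
	fmt.Println(udpChecksumValid(src, dst, seg))
}
```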
Andrei Vagin 70c45e09cf socket/unix: (*connectionedEndpoint).State() has to take the endpoint lock
It accesses e.receiver which is protected by the endpoint lock.

WARNING: DATA RACE
Write at 0x00c0006aa2b8 by goroutine 189:
  pkg/sentry/socket/unix/transport.(*connectionedEndpoint).Connect.func1()
      pkg/sentry/socket/unix/transport/connectioned.go:359 +0x50
  pkg/sentry/socket/unix/transport.(*connectionedEndpoint).BidirectionalConnect()
      pkg/sentry/socket/unix/transport/connectioned.go:327 +0xa3c
  pkg/sentry/socket/unix/transport.(*connectionedEndpoint).Connect()
      pkg/sentry/socket/unix/transport/connectioned.go:363 +0xca
  pkg/sentry/socket/unix.(*socketOpsCommon).Connect()
      pkg/sentry/socket/unix/unix.go:420 +0x13a
  pkg/sentry/socket/unix.(*SocketOperations).Connect()
      <autogenerated>:1 +0x78
  pkg/sentry/syscalls/linux.Connect()
      pkg/sentry/syscalls/linux/sys_socket.go:286 +0x251

Previous read at 0x00c0006aa2b8 by goroutine 270:
  pkg/sentry/socket/unix/transport.(*baseEndpoint).Connected()
      pkg/sentry/socket/unix/transport/unix.go:789 +0x42
  pkg/sentry/socket/unix/transport.(*connectionedEndpoint).State()
      pkg/sentry/socket/unix/transport/connectioned.go:479 +0x2f
  pkg/sentry/socket/unix.(*socketOpsCommon).State()
      pkg/sentry/socket/unix/unix.go:714 +0xc3e
  pkg/sentry/socket/unix.(*socketOpsCommon).SendMsg()
      pkg/sentry/socket/unix/unix.go:466 +0xc44
  pkg/sentry/socket/unix.(*SocketOperations).SendMsg()
      <autogenerated>:1 +0x173
  pkg/sentry/syscalls/linux.sendTo()
      pkg/sentry/syscalls/linux/sys_socket.go:1121 +0x4c5
  pkg/sentry/syscalls/linux.SendTo()
      pkg/sentry/syscalls/linux/sys_socket.go:1134 +0x87

Reported-by: syzbot+c2be37eedc672ed59a86@syzkaller.appspotmail.com
PiperOrigin-RevId: 317236996
2020-06-18 20:28:10 -07:00
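
The fix in 70c45e09cf amounts to taking the endpoint lock in the reader as well as the writer. A minimal sketch of that pattern follows; the `endpoint` and `queueReceiver` types are simplified stand-ins, not the actual `connectionedEndpoint`.

```go
package main

import (
	"fmt"
	"sync"
)

// queueReceiver is a placeholder for whatever the endpoint receives from.
type queueReceiver struct{}

// endpoint is a simplified stand-in for a connection-oriented endpoint.
type endpoint struct {
	mu       sync.Mutex
	receiver *queueReceiver // guarded by mu
}

// Connect installs the receiver while holding the endpoint lock.
func (e *endpoint) Connect() {
	e.mu.Lock()
	defer e.mu.Unlock()
	e.receiver = &queueReceiver{}
}

// State reads receiver, so it must hold the same lock as Connect.
// Reading e.receiver without the lock is the kind of access the race
// detector reports.
func (e *endpoint) State() string {
	e.mu.Lock()
	defer e.mu.Unlock()
	if e.receiver != nil {
		return "connected"
	}
	return "unconnected"
}

func main() {
	e := &endpoint{}
	var wg sync.WaitGroup
	wg.Add(2)
	go func() { defer wg.Done(); e.Connect() }()
	go func() { defer wg.Done(); _ = e.State() }()
	wg.Wait()
	fmt.Println(e.State()) // connected
}
```

Running such code with `go run -race` is one way to confirm that the concurrent `Connect` and `State` calls no longer race.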
Kevin Krakauer 28b8a5cc3a iptables: remove metadata struct
Metadata was useful for debugging and safety, but enough tests exist that we
should see failures when (de)serialization is broken. It made stack
initialization more cumbersome and it's also getting in the way of ip6tables.

PiperOrigin-RevId: 317210653
2020-06-18 17:02:16 -07:00
Michael Pratt 3970c12743 Remove various uses of 'whitelist'
Updates #2972

PiperOrigin-RevId: 317113059
2020-06-18 09:03:39 -07:00
Bhasker Hariharan 07ff909e76 Support setsockopt SO_SNDBUF/SO_RCVBUF for raw/udp sockets.
Updates #173,#6
Fixes #2888

PiperOrigin-RevId: 317087652
2020-06-18 06:07:20 -07:00
Fabricio Voznika 96519e2c9d Implement POSIX locks
- Change FileDescriptionImpl Lock/UnlockPOSIX signature to
  take {start,length,whence}, so the correct offset can be
  calculated in the implementations.
- Create PosixLocker interface to make it possible to share
  the same locking code from different implementations.

Closes #1480

PiperOrigin-RevId: 316910286
2020-06-17 10:04:26 -07:00
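
To illustrate why 96519e2c9d passes {start,length,whence} down to the implementations, here is a sketch of turning that triple into an absolute byte range, following the fcntl(2) conventions (a length of 0 means "to end of file", and a negative length covers the bytes before the starting point). The `lockRange` function, its constants, and its error value are hypothetical; overflow of `base + start` is not checked in this sketch.

```go
package main

import (
	"errors"
	"fmt"
	"math"
)

// Local mirrors of SEEK_SET, SEEK_CUR and SEEK_END.
const (
	seekSet = iota // relative to the start of the file
	seekCur        // relative to the current file offset
	seekEnd        // relative to the end of the file
)

// lockEOF marks a range that extends to the end of the file, however
// large the file grows.
const lockEOF = math.MaxInt64

var errInvalidRange = errors.New("invalid lock range")

// lockRange converts {start, length, whence} into the absolute half-open
// range [begin, end). The base offset depends on whence, which is why the
// implementation (which knows the current offset and file size) has to do
// the calculation.
func lockRange(start, length int64, whence int, curOffset, fileSize int64) (begin, end int64, err error) {
	var base int64
	switch whence {
	case seekSet:
		base = 0
	case seekCur:
		base = curOffset
	case seekEnd:
		base = fileSize
	default:
		return 0, 0, errInvalidRange
	}
	begin = base + start
	end = begin
	if length < 0 {
		begin += length // range is [start+length, start)
	}
	if begin < 0 {
		return 0, 0, errInvalidRange
	}
	switch {
	case length == 0:
		end = lockEOF
	case length > 0:
		if length > lockEOF-begin {
			return 0, 0, errInvalidRange
		}
		end = begin + length
	}
	return begin, end, nil
}

func main() {
	fmt.Println(lockRange(10, 5, seekCur, 100, 1000))
	fmt.Println(lockRange(100, -20, seekSet, 0, 1000))
}
```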
Nayana Bidari 4b9652d63b {S,G}etsockopt for TCP_KEEPCNT option.
TCP_KEEPCNT is used to set the maximum number of keepalive probes to be
sent before dropping the connection.

WANT_LGTM=jchacon
PiperOrigin-RevId: 315758094
2020-06-10 13:37:27 -07:00
Andrei Vagin a5a4f80487 socket/unix: handle sendto address argument for connected sockets
In case of SOCK_SEQPACKET, it has to be ignored.
In case of SOCK_STREAM, EISCONN or EOPNOTSUPP has to be returned.

PiperOrigin-RevId: 315755972
2020-06-10 13:26:54 -07:00
Fabricio Voznika 67565078bb Implement flock(2) in VFS2
LockFD is the generic implementation that can be embedded in
FileDescriptionImpl implementations. Unique lock ID is
maintained in vfs.FileDescription and is created on demand.

Updates #1480

PiperOrigin-RevId: 315604825
2020-06-09 18:46:42 -07:00