summaryrefslogtreecommitdiff
path: root/include/net (follow)
Commit message (Collapse)AuthorAge
...
| * | | | Bluetooth: Send index information updates to monitor channelMarcel Holtmann2015-10-08
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The Bluetooth public device address might change during controller setup and it makes it a lot simpler for monitoring tools if they just get told what the new address is. In addition include the manufacturer / company information of the controller. That allows for easy vendor specific HCI command and event handling. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
| * | | | Bluetooth: Send transport open and close monitor eventsMarcel Holtmann2015-10-05
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When the core starts or shuts down the actual HCI transport, send a new monitor event that indicates that this is happening. These new events correspond to HCI_DEV_OPEN and HCI_DEV_CLOSE events. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
| * | | | Bluetooth: Introduce HCI_DEV_OPEN and HCI_DEV_CLOSE eventsMarcel Holtmann2015-10-05
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When opening the HCI transport via hdev->open send HCI_DEV_OPEN event and when closing the HCI transport via hdev->close send HCI_DEV_CLOSE. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
| * | | | mac802154: check on len instead mac_lenAlexander Aring2015-09-30
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch change the length check to len instead of mac_len for checking if the frame control field is available to dereference. We need to change it because I saw issues with af_packet raw sockets and the mrf24j40 which calls this functionality. The raw socket functionality doesn't set the mac_len but resets the skb_mac_header to skb->data which is still correct. The issue occur at mrf24j40 only, because the driver need to evaluate the fc fields. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
| * | | | nl802154: add support for security layerAlexander Aring2015-09-30
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch adds support for accessing mac802154 llsec implementation over nl802154. I added for a new Kconfig entry to provide this functionality CONFIG_IEEE802154_NL802154_EXPERIMENTAL. This interface is still in development. It provides to change security parameters and add/del/dump entries of security tables. Later we can add also a get to get an entry by unique identifier. Cc: Phoebe Buckheister <phoebe.buckheister@itwm.fraunhofer.de> Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
| * | | | netlink: add nla_get for le32 and le64Alexander Aring2015-09-30
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch adds missing inline wrappers for nla_get_le32 and nla_get_le64. The 802.15.4 MAC byteorder is little endian and we keep the byteorder for fields like address configuration in the same byteorder as it comes from the MAC layer. To provide these fields for nl802154 userspace applications, we need these inline wrappers for netlink. Cc: David S. Miller <davem@davemloft.net> Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
| * | | | Bluetooth: Add hci_cmd_sync functionLoic Poulain2015-09-29
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Send a HCI command and wait for command complete event. This function serializes the requests by grabbing the req_lock. Signed-off-by: Loic Poulain <loic.poulain@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
| * | | | Bluetooth: Add BT_WARN and bt_dev_warn logging macrosFrederic Danis2015-09-24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add warning logging macros to bluetooth subsystem logs. Signed-off-by: Frederic Danis <frederic.danis@linux.intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
| * | | | ieee802154: change needed headroom/tailroomAlexander Aring2015-09-22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch cleanups needed_headroom, needed_tailroom and hard_header_len fields for wpan and lowpan interfaces. For wpan interfaces the worst case mac header len should be part of needed_headroom, currently this is set as hard_header_len, but hard_header_len should be set to the minimum header length which xmit call assumes and this is the minimum frame length of 802.15.4. The hard_header_len value will check inside send callbacl of AF_PACKET raw sockets. For lowpan interfaces, if fragmentation isn't needed the skb will call dev_hard_header for 802154 layer and queue it afterwards. This happens without new skb allocation, so we need the same headroom and tailroom lengths like 802154 inside 802154 6lowpan layer. At least we assume as minimum header length an ipv6 header size. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
| * | | | ieee802154: introduce wpan_dev_header_opsAlexander Aring2015-09-22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The current header_ops callback structure of net device are used mostly from 802.15.4 upper-layers. Because this callback structure is a very generic one, which is also used by e.g. DGRAM AF_PACKET sockets, we can't make this callback structure 802.15.4 specific which is currently is. I saw the smallest "constraint" for calling this callback with dev_hard_header/dev_parse_header by AF_PACKET which assign a 8 byte array for address void pointers. Currently 802.15.4 specific protocols like af802154 and 6LoWPAN will assign the "struct ieee802154_addr" as these parameters which is greater than 8 bytes. The current callback implementation for header_ops.create assumes always a complete "struct ieee802154_addr" which AF_PACKET can't never handled and is greater than 8 bytes. For that reason we introduce now a "generic" create/parse header_ops callback which allows handling with intra-pan extended addresses only. This allows a small use-case with AF_PACKET to send "somehow" a valid dataframe over DGRAM. To keeping the current dev_hard_header behaviour we introduce a similar callback structure "wpan_dev_header_ops" which contains 802.15.4 specific upper-layer header creation functionality, which can be called by wpan_dev_hard_header. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
| * | | | ieee802154: header_ops: fix frame control settingAlexander Aring2015-09-22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Sometimes upper-layer protocols wants to generate a new mac header by filling "struct ieee802154_hdr" only. These upper-layers sets for the address settings the source and dest fields, but not the fc fields for indicate the source and dest address mode. This patch changes the "ieee802154_hdr_push" function so the fc address fields are set according the source and dest fields of "struct ieee802154_hdr". Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
* | | | | net: synack packets can be attached to request socketsEric Dumazet2015-10-11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | selinux needs few changes to accommodate fact that SYNACK messages can be attached to a request socket, lacking sk_security pointer (Only syncookies are still attached to a TCP_LISTEN socket) Adds a new sk_listener() helper, and use it in selinux and sch_fq Fixes: ca6fb0651883 ("tcp: attach SYNACK messages to request sockets instead of listener") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported by: kernel test robot <ying.huang@linux.intel.com> Cc: Paul Moore <paul@paul-moore.com> Cc: Stephen Smalley <sds@tycho.nsa.gov> Cc: Eric Paris <eparis@parisplace.org> Acked-by: Paul Moore <paul@paul-moore.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | | dst: Pass net into dst->outputEric W. Biederman2015-10-08
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The network namespace is already passed into dst_output pass it into dst->output lwt->output and friends. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | | ipv4, ipv6: Pass net into ip_local_out and ip6_local_outEric W. Biederman2015-10-08
| | | | | | | | | | | | | | | | | | | | | | | | | Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | | ipv4, ipv6: Pass net into __ip_local_out and __ip6_local_outEric W. Biederman2015-10-08
| | | | | | | | | | | | | | | | | | | | | | | | | Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | | ipv6: Merge ip6_local_out and ip6_local_out_skEric W. Biederman2015-10-08
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Stop hidding the sk parameter with an inline helper function and make all of the callers pass it, so that it is clear what the function is doing. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | | ipv6: Merge __ip6_local_out and __ip6_local_out_skEric W. Biederman2015-10-08
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Only __ip6_local_out_sk has callers so rename __ip6_local_out_sk __ip6_local_out and remove the previous __ip6_local_out. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | | ipv4: Merge ip_local_out and ip_local_out_skEric W. Biederman2015-10-08
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | It is confusing and silly hiding a parameter so modify all of the callers to pass in the appropriate socket or skb->sk if no socket is known. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | | ipv4: Merge __ip_local_out and __ip_local_out_skEric W. Biederman2015-10-08
| | | | | | | | | | | | | | | | | | | | | | | | | Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | | dst: Pass a sk into .local_outEric W. Biederman2015-10-08
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | For consistency with the other similar methods in the kernel pass a struct sock into the dst_ops .local_out method. Simplifying the socket passing case is needed a prequel to passing a struct net reference into .local_out. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | | net: Pass net into dst_output and remove dst_output_okfnEric W. Biederman2015-10-08
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Replace dst_output_okfn with dst_output Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | | Merge tag 'mac80211-next-for-davem-2015-10-05' of ↵David S. Miller2015-10-07
|\ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211-next Johannes Berg says: ==================== For the current cycle, we have the following right now: * many internal fixes, API improvements, cleanups, etc. * full AP client state tracking in cfg80211/mac80211 from Ayala * VHT support (in mac80211) for mesh * some A-MSDU in A-MPDU support from Emmanuel * show current TX power to userspace (from Rafał) * support for netlink dump in vendor commands (myself) ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | cfg80211: allow changing station capabilities for unassociated stationsAyala Beker2015-09-29
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently, cfg80211 rejects capability updates for existing entries and as a result it's impossible to update entries that were added unassociated, but that is necessary to go through the full station states from userspace, adding a station before authentication etc. Fix this by allowing updates to capabilities for stations that the driver (or mac80211) assigned unassociated state. Drivers setting the full station state support flag must use the new station type for proper operation. Signed-off-by: Ayala Beker <ayala.beker@intel.com> Signed-off-by: Luca Coelho <luciano.coelho@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
| * | | | | mac80211: Copy tx'ed beacons to monitor modeHelmut Schaa2015-09-29
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When debugging wireless powersave issues on the AP side it's quite helpful to see our own beacons that are transmitted by the hardware/driver. However, this is not that easy since beacons don't pass through the regular TX queues. Preferably drivers would call ieee80211_tx_status also for tx'ed beacons but that's not always possible. Hence, just send a copy of each beacon generated by ieee80211_beacon_get_tim to monitor devices when they are getting fetched by the driver. Also add a HW flag IEEE80211_HW_BEACON_TX_STATUS that can be used by drivers to indicate that they report TX status for beacons. Signed-off-by: Helmut Schaa <helmut.schaa@googlemail.com> (with a fix from Christian Lamparted rolled in) Signed-off-by: Johannes Berg <johannes.berg@intel.com>
| * | | | | Revert "mac80211: add pointer for driver use to key"Johannes Berg2015-09-22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This reverts commit f9a060f4b2003eb7350762e60dfc576447e44bad. No driver has turned up needing this functionality, and I've just implemented the functionality I wanted this for in a different way. Thus, remove it again, until somebody shows up with a need for having it. Signed-off-by: Johannes Berg <johannes.berg@intel.com>
| * | | | | mac80211: allow the driver to advertise A-MSDU within A-MPDU Rx supportEmmanuel Grumbach2015-09-22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Drivers may be interested in receiving A-MSDU within A-MDPU. Not all the devices may be able to do so, make it configurable. Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
| * | | | | mac80211: allow to transmit A-MSDU within A-MPDUEmmanuel Grumbach2015-09-22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Advertise the capability to send A-MSDU within A-MPDU in the AddBA request sent by mac80211. Let the driver know about the peer's capabilities. Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
| * | | | | mac80211: introduce per vif frame registration APIAndrei Otcheretianski2015-09-22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently the cfg80211's frame registration api receives wdev, however mac80211 assumes per device filter configuration and ignores wdev. Per device filtering is too wasteful, especially for multi-channel devices. Introduce new per vif frame registration API and use it for probe request registrations in ieee80211_mgmt_frame_register() Also call directly to ieee80211_configure_filter instead of using a work since it is now allowed to sleep in ieee80211_mgmt_frame_register. Signed-off-by: Andrei Otcheretianski <andrei.otcheretianski@intel.com> Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
| * | | | | nl80211: support vendor dumpit commandsJohannes Berg2015-09-22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In order to transfer many items in vendor commands, support the dumpit netlink method for them. Signed-off-by: Johannes Berg <johannes.berg@intel.com>
* | | | | | net: Add source address lookup op for VRFDavid Ahern2015-10-07
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add operation to l3mdev to lookup source address for a given flow. Add support for the operation to VRF driver and convert existing IPv4 hooks to use the new lookup. Signed-off-by: David Ahern <dsa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | | | net: Refactor path selection in __ip_route_output_key_hashDavid Ahern2015-10-07
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | VRF device needs the same path selection following lookup to set source address. Rather than duplicating code, move existing code into a function that is exported to modules. Code move only; no functional change. Signed-off-by: David Ahern <dsa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | | | net: Rename FLOWI_FLAG_VRFSRC to FLOWI_FLAG_L3MDEV_SRCDavid Ahern2015-10-07
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Signed-off-by: David Ahern <dsa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | | | Merge branch 'master' of ↵David S. Miller2015-10-05
|\ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/net-next Eric W. Biederman says: ==================== net: Pass net through ip fragmention This is the next installment of my work to pass struct net through the output path so the code does not need to guess how to figure out which network namespace it is in, and ultimately routes can have output devices in another network namespace. This round focuses on passing net through ip fragmentation which we seem to call from about everywhere. That is the main ip output paths, the bridge netfilter code, and openvswitch. This has to happend at once accross the tree as function pointers are involved. First some prep work is done, then ipv4 and ipv6 are converted and then temporary helper functions are removed. ==================== Acked-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | | ipv6: Pass struct net through ip6_fragmentEric W. Biederman2015-09-30
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
| * | | | | | ipv4: Pass struct net through ip_fragmentEric W. Biederman2015-09-30
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
* | | | | | | ipv4: ICMP packet inspection for multipathPeter Nørlund2015-10-05
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | ICMP packets are inspected to let them route together with the flow they belong to, minimizing the chance that a problematic path will affect flows on other paths, and so that anycast environments can work with ECMP. Signed-off-by: Peter Nørlund <pch@ordbogen.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | | | | ipv4: L3 hash-based multipathPeter Nørlund2015-10-05
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Replaces the per-packet multipath with a hash-based multipath using source and destination address. Signed-off-by: Peter Nørlund <pch@ordbogen.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | | | | tcp: avoid two atomic ops for syncookiesEric Dumazet2015-10-05
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | inet_reqsk_alloc() is used to allocate a temporary request in order to generate a SYNACK with a cookie. Then later, syncookie validation also uses a temporary request. These paths already took a reference on listener refcount, we can avoid a couple of atomic operations. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | | | | net: use sk_fullsock() in __netdev_pick_tx()Eric Dumazet2015-10-05
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | SYN_RECV & TIMEWAIT sockets are not full blown, they do not have a sk_dst_cache pointer. Fixes: ca6fb0651883 ("tcp: attach SYNACK messages to request sockets instead of listener") Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | | | | inet: ip_skb_dst_mtu() should use sk_fullsock()Eric Dumazet2015-10-05
| |_|_|/ / / |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | SYN_RECV & TIMEWAIT sockets are not full blown, do not even try to call ip_sk_use_pmtu() on them. Fixes: ca6fb0651883 ("tcp: attach SYNACK messages to request sockets instead of listener") Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | | | tcp/dccp: add SLAB_DESTROY_BY_RCU flag for request socketsEric Dumazet2015-10-03
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Before letting request sockets being put in TCP/DCCP regular ehash table, we need to add either : - SLAB_DESTROY_BY_RCU flag to their kmem_cache - add RCU grace period before freeing them. Since we carefully respected the SLAB_DESTROY_BY_RCU protocol like ESTABLISH and TIMEWAIT sockets, use it here. req_prot_init() being only used by TCP and DCCP, I did not add a new slab_flags into their rsk_prot, but reuse prot->slab_flags Since all reqsk_alloc() users are correctly dealing with a failure, add the __GFP_NOWARN flag to avoid traces under pressure. Fixes: 079096f103fa ("tcp/dccp: install syn_recv requests into ehash table") Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | | | switchdev: push object ID back to object structureJiri Pirko2015-10-03
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Suggested-by: Scott Feldman <sfeldma@gmail.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Acked-by: Scott Feldman <sfeldma@gmail.com> Reviewed-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | | | switchdev: bring back switchdev_obj and use it as a generic object paramJiri Pirko2015-10-03
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Replace "void *obj" with a generic structure. Introduce couple of helpers along that. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Acked-by: Scott Feldman <sfeldma@gmail.com> Reviewed-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | | | switchdev: rename switchdev_obj_fdb to switchdev_obj_port_fdbJiri Pirko2015-10-03
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Make the struct name in sync with object id name. Suggested-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Acked-by: Scott Feldman <sfeldma@gmail.com> Reviewed-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | | | switchdev: rename switchdev_obj_vlan to switchdev_obj_port_vlanJiri Pirko2015-10-03
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Make the struct name in sync with object id name. Suggested-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Acked-by: Scott Feldman <sfeldma@gmail.com> Reviewed-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | | | switchdev: rename SWITCHDEV_ATTR_* enum values to SWITCHDEV_ATTR_ID_*Jiri Pirko2015-10-03
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | To be aligned with obj. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Acked-by: Scott Feldman <sfeldma@gmail.com> Reviewed-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | | | switchdev: rename SWITCHDEV_OBJ_* enum values to SWITCHDEV_OBJ_ID_*Jiri Pirko2015-10-03
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Suggested-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Acked-by: Scott Feldman <sfeldma@gmail.com> Reviewed-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | | | tcp: remove max_qlen_logEric Dumazet2015-10-03
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This control variable was set at first listen(fd, backlog) call, but not updated if application tried to increase or decrease backlog. It made sense at the time listener had a non resizeable hash table. Also rounding to powers of two was not very friendly. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | | | tcp/dccp: remove struct listen_sockEric Dumazet2015-10-03
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | It is enough to check listener sk_state, no need for an extra condition. max_qlen_log can be moved into struct request_sock_queue We can remove syn_wait_lock and the alignment it enforced. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | | | tcp: attach SYNACK messages to request sockets instead of listenerEric Dumazet2015-10-03
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If a listen backlog is very big (to avoid syncookies), then the listener sk->sk_wmem_alloc is the main source of false sharing, as we need to touch it twice per SYNACK re-transmit and TX completion. (One SYN packet takes listener lock once, but up to 6 SYNACK are generated) By attaching the skb to the request socket, we remove this source of contention. Tested: listen(fd, 10485760); // single listener (no SO_REUSEPORT) 16 RX/TX queue NIC Sustain a SYNFLOOD attack of ~320,000 SYN per second, Sending ~1,400,000 SYNACK per second. Perf profiles now show listener spinlock being next bottleneck. 20.29% [kernel] [k] queued_spin_lock_slowpath 10.06% [kernel] [k] __inet_lookup_established 5.12% [kernel] [k] reqsk_timer_handler 3.22% [kernel] [k] get_next_timer_interrupt 3.00% [kernel] [k] tcp_make_synack 2.77% [kernel] [k] ipt_do_table 2.70% [kernel] [k] run_timer_softirq 2.50% [kernel] [k] ip_finish_output 2.04% [kernel] [k] cascade Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>