Jay Cliburn noticed and diagnosed a bug triggered in
dev_gso_skb_destructor() after last change from qdisc->gso_skb
to qdisc->requeue list. Since gso_segmented skbs can't be queued
to another list this patch brings back qdisc->gso_skb for them.
Reported-by: Jay Cliburn <jcliburn@gmail.com>
Signed-off-by: Jarek Poplawski <jarkao2@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use new qdisc_root_sleeping_lock() instead of qdisc_root_lock() as
sch_tree_lock() because this lock could be used while dev is
deactivated, but we never need to use this with noop_qdisc as a root.
Signed-off-by: Jarek Poplawski <jarkao2@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
While passing a qdisc root lock to gen_new_estimator() and
gen_replace_estimator() dev could be deactivated or even before
grafting proper root qdisc as qdisc_sleeping (e.g. qdisc_create), so
using qdisc_root_lock() is not enough. This patch adds
qdisc_root_sleeping_lock() for this, plus additional checks, where
necessary.
Signed-off-by: Jarek Poplawski <jarkao2@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
dev_deactivate() can skip rescheduling of a qdisc by qdisc_watchdog()
or other timer calling netif_schedule() after dev_queue_deactivate().
We prevent this checking aliveness before scheduling the timer. Since
during deactivation the root qdisc is available only as qdisc_sleeping
additional accessor qdisc_root_sleeping() is created.
With feedback from Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Jarek Poplawski <jarkao2@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We can now kill them synchronously with all of the
previous dev_deactivate() cures.
This makes netdev destruction and shutdown saner as
the qdiscs hold references to the device.
Signed-off-by: David S. Miller <davem@davemloft.net>
This new state lets dev_deactivate() mark a qdisc as having been
deactivated.
dev_queue_xmit() and ing_filter() check for this bit and do not
try to process the qdisc if the bit is set.
dev_deactivate() polls the qdisc after setting the bit, waiting
for both __QDISC_STATE_RUNNING and __QDISC_STATE_SCHED to clear.
This isn't perfect yet, but subsequent changesets will make it so.
This part is just one piece of the puzzle.
Signed-off-by: David S. Miller <davem@davemloft.net>
Patrick McHardy <kaber@trash.net> noticed that it would be nice to
handle NET_XMIT_BYPASS by NET_XMIT_SUCCESS with an internal qdisc flag
__NET_XMIT_BYPASS and to remove the mapping from dev_queue_xmit().
David Miller <davem@davemloft.net> spotted a serious bug in the first
version of this patch.
Signed-off-by: Jarek Poplawski <jarkao2@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Patrick McHardy <kaber@trash.net> noticed:
"The other problem that affects all qdiscs supporting actions is
TC_ACT_QUEUED/TC_ACT_STOLEN getting mapped to NET_XMIT_SUCCESS
even though the packet is not queued, corrupting upper qdiscs'
qlen counters."
and later explained:
"The reason why it translates it at all seems to be to not increase
the drops counter. Within a single qdisc this could be avoided by
other means easily, upper qdiscs would still increase the counter
when we return anything besides NET_XMIT_SUCCESS though.
This means we need a new NET_XMIT return value to indicate this to
the upper qdiscs. So I'd suggest to introduce NET_XMIT_STOLEN,
return that to upper qdiscs and translate it to NET_XMIT_SUCCESS
in dev_queue_xmit, similar to NET_XMIT_BYPASS."
David Miller <davem@davemloft.net> noticed:
"Maybe these NET_XMIT_* values being passed around should be a set of
bits. They could be composed of base meanings, combined with specific
attributes.
So you could say "NET_XMIT_DROP | __NET_XMIT_NO_DROP_COUNT"
The attributes get masked out by the top-level ->enqueue() caller,
such that the base meanings are the only thing that make their
way up into the stack. If it's only about communication within the
qdisc tree, let's simply code it that way."
This patch is trying to realize these ideas.
Signed-off-by: Jarek Poplawski <jarkao2@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
It is the only legal environment in which this can be
used.
Add some commentary explaining the situation.
Signed-off-by: David S. Miller <davem@davemloft.net>
The stab bits can't be referenced uniless the full
packet scheduler layer is enabled.
Reported by Stephen Rothwell.
Signed-off-by: David S. Miller <davem@davemloft.net>
The u32_list is just an indirect way of maintaining a reference
to a U32 node on a per-qdisc basis.
Just add an explicit node pointer for u32 to struct Qdisc an do
away with this global list.
Signed-off-by: David S. Miller <davem@davemloft.net>
We have to have exclusive access to the given qdisc anyways, so
doing even more locking is superfluous.
Signed-off-by: David S. Miller <davem@davemloft.net>
Make sch_tree_lock() lock the qdisc's root. All of the
users hold the RTNL semaphore and the root qdisc is not
changing.
Implement tbf_tree_{lock,unlock}() simply in terms of
sch_tree_{lock,unlock}().
Signed-off-by: David S. Miller <davem@davemloft.net>
When we have shared qdiscs, packets come out of the qdiscs
for multiple transmit queues.
Therefore it doesn't make any sense to schedule the transmit
queue when logically we cannot know ahead of time the TX
queue of the SKB that the qdisc->dequeue() will give us.
Just for sanity I added a BUG check to make sure we never
get into a state where the noop_qdisc is scheduled.
Signed-off-by: David S. Miller <davem@davemloft.net>
When code wants to lock the qdisc tree state, the logic
operation it's doing is locking the top-level qdisc that
sits of the root of the netdev_queue.
Add qdisc_root_lock() to represent this and convert the
easiest cases.
In order for this to work out in all cases, we have to
hook up the noop_qdisc to a dummy netdev_queue.
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently it is associated with a netdev_queue, but when we have
qdisc sharing that no longer makes any sense.
Signed-off-by: David S. Miller <davem@davemloft.net>
We liberate any dangling gso_skb during qdisc destruction.
It really only matters for the root qdisc. But when qdiscs
can be shared by multiple netdev_queue objects, we can't
have the gso_skb in the netdev_queue any more.
Signed-off-by: David S. Miller <davem@davemloft.net>
alloc_netdev_mq() now allocates an array of netdev_queue
structures for TX, based upon the queue_count argument.
Furthermore, all accesses to the TX queues are now vectored
through the netdev_get_tx_queue() and netdev_for_each_tx_queue()
interfaces. This makes it easy to grep the tree for all
things that want to get to a TX queue of a net device.
Problem spots which are not really multiqueue aware yet, and
only work with one queue, can easily be spotted by grepping
for all netdev_get_tx_queue() calls that pass in a zero index.
Signed-off-by: David S. Miller <davem@davemloft.net>