| Commit message (Collapse) | Author | Age |
| ... | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | |
| | | | | | | | | | | | | |
| | | | | | | | | | | | | |
| | | | | | | | | | | | | |
| | | | | | | | | | | | | |
| | | | | | | | | | | | | |
| | | | | | | | | | | | | |
| | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
Create a new header, and hide the device assignment functions there.
Move struct kvm_assigned_dev_kernel to assigned-dev.c by modifying
arch/x86/kvm/iommu.c to take a PCI device struct.
Based on a patch by Radim Krcmar <rkrcmark@redhat.com>.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | |
| | | | | | | | | | | | | |
| | | | | | | | | | | | | |
| | | | | | | | | | | | | |
| | | | | | | | | | | | | |
| | | | | | | | | | | | | |
| | | | | | | | | | | | | |
| | | | | | | | | | | | | |
| | | | | | | | | | | | | |
| | | | | | | | | | | | | |
| | | | | | | | | | | | | |
| | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
Now that ia64 is gone, we can hide deprecated device assignment in x86.
Notable changes:
- kvm_vm_ioctl_assigned_device() was moved to x86/kvm_arch_vm_ioctl()
The easy parts were removed from generic kvm code, remaining
- kvm_iommu_(un)map_pages() would require new code to be moved
- struct kvm_assigned_dev_kernel depends on struct kvm_irq_ack_notifier
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
| | | |_|_|_|_|/ / / / / / /
| |/| | | | | | | | | | |
| | | | | | | | | | | | |
| | | | | | | | | | | | |
| | | | | | | | | | | | |
| | | | | | | | | | | | |
| | | | | | | | | | | | |
| | | | | | | | | | | | |
| | | | | | | | | | | | | |
ia64 does not need them anymore. Ack notifiers become x86-specific
too.
Suggested-by: Gleb Natapov <gleb@kernel.org>
Reviewed-by: Radim Krcmar <rkrcmar@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
| |\ \ \ \ \ \ \ \ \ \ \ \ \
| | | | | | | | | | | | | |
| | | | | | | | | | | | | |
| | | | | | | | | | | | | |
| | | | | | | | | | | | | |
| | | | | | | | | | | | | |
| | | | | | | | | | | | | |
| | | | | | | | | | | | | |
| | | | | | | | | | | | | |
| | | | | | | | | | | | | |
| | | | | | | | | | | | | |
| | | | | | | | | | | | | |
| | | | | | | | | | | | | |
| | | | | | | | | | | | | |
| | | | | | | | | | | | | |
| | | | | | | | | | | | | |
| | | | | | | | | | | | | |
| | | | | | | | | | | | | |
| | | | | | | | | | | | | |
| | | | | | | | | | | | | |
| | | | | | | | | | | | | |
| | | | | | | | | | | | | |
| | | | | | | | | | | | | |
| | | | | | | | | | | | | |
| | | | | | | | | | | | | |
| | | | | | | | | | | | | |
| | | | | | | | | | | | | |
| | | | | | | | | | | | | |
| | | | | | | | | | | | | |
| | | | | | | | | | | | | |
| | | | | | | | | | | | | |
| | | | | | | | | | | | | |
| | | | | | | | | | | | | |
| | | | | | | | | | | | | |
| | | | | | | | | | | | | |
| | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client
Pull ceph updates from Sage Weil:
"The big item here is support for inline data for CephFS and for
message signatures from Zheng. There are also several bug fixes,
including interrupted flock request handling, 0-length xattrs, mksnap,
cached readdir results, and a message version compat field. Finally
there are several cleanups from Ilya, Dan, and Markus.
Note that there is another series coming soon that fixes some bugs in
the RBD 'lingering' requests, but it isn't quite ready yet"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client: (27 commits)
ceph: fix setting empty extended attribute
ceph: fix mksnap crash
ceph: do_sync is never initialized
libceph: fixup includes in pagelist.h
ceph: support inline data feature
ceph: flush inline version
ceph: convert inline data to normal data before data write
ceph: sync read inline data
ceph: fetch inline data when getting Fcr cap refs
ceph: use getattr request to fetch inline data
ceph: add inline data to pagecache
ceph: parse inline data in MClientReply and MClientCaps
libceph: specify position of extent operation
libceph: add CREATE osd operation support
libceph: add SETXATTR/CMPXATTR osd operations support
rbd: don't treat CEPH_OSD_OP_DELETE as extent op
ceph: remove unused stringification macros
libceph: require cephx message signature by default
ceph: introduce global empty snap context
ceph: message versioning fixes
...
|
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | |
| | | | | | | | | | | | | |
| | | | | | | | | | | | | |
| | | | | | | | | | | | | |
| | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
pagelist.h needs to include linux/types.h and asm/byteorder.h and not
rely on other headers pulling yet another set of headers.
Signed-off-by: Ilya Dryomov <idryomov@redhat.com>
|
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | |
| | | | | | | | | | | | | |
| | | | | | | | | | | | | |
| | | | | | | | | | | | | |
| | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
Add a new parameter 'locked_page' to ceph_do_getattr(). If inline data
in getattr reply will be copied to the page.
Signed-off-by: Yan, Zheng <zyan@redhat.com>
|
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | |
| | | | | | | | | | | | | |
| | | | | | | | | | | | | |
| | | | | | | | | | | | | |
| | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
Request reply and cap message can contain inline data. add inline data
to the page cache if there is Fc cap.
Signed-off-by: Yan, Zheng <zyan@redhat.com>
|
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | |
| | | | | | | | | | | | | |
| | | | | | | | | | | | | |
| | | | | | | | | | | | | |
| | | | | | | | | | | | | |
| | | | | | | | | | | | | |
| | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
allow specifying position of extent operation in multi-operations
osd request. This is required for cephfs to convert inline data to
normal data (compare xattr, then write object).
Signed-off-by: Yan, Zheng <zyan@redhat.com>
Reviewed-by: Ilya Dryomov <idryomov@redhat.com>
|
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | |
| | | | | | | | | | | | | |
| | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
Signed-off-by: Yan, Zheng <zyan@redhat.com>
Reviewed-by: Ilya Dryomov <idryomov@redhat.com>
|
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | |
| | | | | | | | | | | | | |
| | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
Signed-off-by: Yan, Zheng <zyan@redhat.com>
Reviewed-by: Ilya Dryomov <idryomov@redhat.com>
|
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | |
| | | | | | | | | | | | | |
| | | | | | | | | | | | | |
| | | | | | | | | | | | | |
| | | | | | | | | | | | | |
| | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
2 bytes of what was reserved space is now used by userspace for the
compat_version field.
Signed-off-by: John Spray <john.spray@redhat.com>
Reviewed-by: Sage Weil <sage@redhat.com>
|
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | |
| | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
Signed-off-by: Yan, Zheng <zyan@redhat.com>
|
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | |
| | | | | | | | | | | | | |
| | | | | | | | | | | | | |
| | | | | | | | | | | | | |
| | | | | | | | | | | | | |
| | | | | | | | | | | | | |
| | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
Use kvfree() from linux/mm.h instead, which is identical. Also fix the
ceph_buffer comment: we will allocate with kmalloc() up to 32k - the
value of PAGE_ALLOC_COSTLY_ORDER, but that really is just an
implementation detail so don't mention it at all.
Signed-off-by: Ilya Dryomov <idryomov@redhat.com>
|
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | |
| | | | | | | | | | | | | |
| | | | | | | | | | | | | |
| | | | | | | | | | | | | |
| | | | | | | | | | | | | |
| | | | | | | | | | | | | |
| | | | | | | | | | | | | |
| | | | | | | | | | | | | |
| | | | | | | | | | | | | |
| | | | | | | | | | | | | |
| | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
When a lock operation is interrupted, current code sends a unlock request to
MDS to undo the lock operation. This method does not work as expected because
the unlock request can drop locks that have already been acquired.
The fix is use the newly introduced CEPH_LOCK_FCNTL_INTR/CEPH_LOCK_FLOCK_INTR
requests to interrupt blocked file lock request. These requests do not drop
locks that have alread been acquired, they only interrupt blocked file lock
request.
Signed-off-by: Yan, Zheng <zyan@redhat.com>
|
| |\ \ \ \ \ \ \ \ \ \ \ \ \ \
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace
Pull user namespace related fixes from Eric Biederman:
"As these are bug fixes almost all of thes changes are marked for
backporting to stable.
The first change (implicitly adding MNT_NODEV on remount) addresses a
regression that was created when security issues with unprivileged
remount were closed. I go on to update the remount test to make it
easy to detect if this issue reoccurs.
Then there are a handful of mount and umount related fixes.
Then half of the changes deal with the a recently discovered design
bug in the permission checks of gid_map. Unix since the beginning has
allowed setting group permissions on files to less than the user and
other permissions (aka ---rwx---rwx). As the unix permission checks
stop as soon as a group matches, and setgroups allows setting groups
that can not later be dropped, results in a situtation where it is
possible to legitimately use a group to assign fewer privileges to a
process. Which means dropping a group can increase a processes
privileges.
The fix I have adopted is that gid_map is now no longer writable
without privilege unless the new file /proc/self/setgroups has been
set to permanently disable setgroups.
The bulk of user namespace using applications even the applications
using applications using user namespaces without privilege remain
unaffected by this change. Unfortunately this ix breaks a couple user
space applications, that were relying on the problematic behavior (one
of which was tools/selftests/mount/unprivileged-remount-test.c).
To hopefully prevent needing a regression fix on top of my security
fix I rounded folks who work with the container implementations mostly
like to be affected and encouraged them to test the changes.
> So far nothing broke on my libvirt-lxc test bed. :-)
> Tested with openSUSE 13.2 and libvirt 1.2.9.
> Tested-by: Richard Weinberger <richard@nod.at>
> Tested on Fedora20 with libvirt 1.2.11, works fine.
> Tested-by: Chen Hanxiao <chenhanxiao@cn.fujitsu.com>
> Ok, thanks - yes, unprivileged lxc is working fine with your kernels.
> Just to be sure I was testing the right thing I also tested using
> my unprivileged nsexec testcases, and they failed on setgroup/setgid
> as now expected, and succeeded there without your patches.
> Tested-by: Serge Hallyn <serge.hallyn@ubuntu.com>
> I tested this with Sandstorm. It breaks as is and it works if I add
> the setgroups thing.
> Tested-by: Andy Lutomirski <luto@amacapital.net> # breaks things as designed :("
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace:
userns: Unbreak the unprivileged remount tests
userns; Correct the comment in map_write
userns: Allow setting gid_maps without privilege when setgroups is disabled
userns: Add a knob to disable setgroups on a per user namespace basis
userns: Rename id_map_mutex to userns_state_mutex
userns: Only allow the creator of the userns unprivileged mappings
userns: Check euid no fsuid when establishing an unprivileged uid mapping
userns: Don't allow unprivileged creation of gid mappings
userns: Don't allow setgroups until a gid mapping has been setablished
userns: Document what the invariant required for safe unprivileged mappings.
groups: Consolidate the setgroups permission checks
mnt: Clear mnt_expire during pivot_root
mnt: Carefully set CL_UNPRIVILEGED in clone_mnt
mnt: Move the clear of MNT_LOCKED from copy_tree to it's callers.
umount: Do not allow unmounting rootfs.
umount: Disallow unprivileged mount force
mnt: Update unprivileged remount test
mnt: Implicitly add MNT_NODEV on remount when it was implicitly added by mount
|
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
- Expose the knob to user space through a proc file /proc/<pid>/setgroups
A value of "deny" means the setgroups system call is disabled in the
current processes user namespace and can not be enabled in the
future in this user namespace.
A value of "allow" means the segtoups system call is enabled.
- Descendant user namespaces inherit the value of setgroups from
their parents.
- A proc file is used (instead of a sysctl) as sysctls currently do
not allow checking the permissions at open time.
- Writing to the proc file is restricted to before the gid_map
for the user namespace is set.
This ensures that disabling setgroups at a user namespace
level will never remove the ability to call setgroups
from a process that already has that ability.
A process may opt in to the setgroups disable for itself by
creating, entering and configuring a user namespace or by calling
setns on an existing user namespace with setgroups disabled.
Processes without privileges already can not call setgroups so this
is a noop. Prodcess with privilege become processes without
privilege when entering a user namespace and as with any other path
to dropping privilege they would not have the ability to call
setgroups. So this remains within the bounds of what is possible
without a knob to disable setgroups permanently in a user namespace.
Cc: stable@vger.kernel.org
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
|
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
setgroups is unique in not needing a valid mapping before it can be called,
in the case of setgroups(0, NULL) which drops all supplemental groups.
The design of the user namespace assumes that CAP_SETGID can not actually
be used until a gid mapping is established. Therefore add a helper function
to see if the user namespace gid mapping has been established and call
that function in the setgroups permission check.
This is part of the fix for CVE-2014-8989, being able to drop groups
without privilege using user namespaces.
Cc: stable@vger.kernel.org
Reviewed-by: Andy Lutomirski <luto@amacapital.net>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
|
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
Today there are 3 instances of setgroups and due to an oversight their
permission checking has diverged. Add a common function so that
they may all share the same permission checking code.
This corrects the current oversight in the current permission checks
and adds a helper to avoid this in the future.
A user namespace security fix will update this new helper, shortly.
Cc: stable@vger.kernel.org
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
|
| |\ \ \ \ \ \ \ \ \ \ \ \ \ \ \
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | |
git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux
Pull virtio updates from Rusty Russell:
"A balloon enhancement, and a minor race-on-module-unload theoretical
bug which doesn't merit cc: stable.
All the exciting stuff went via MST this cycle"
* tag 'virtio-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux:
virtio_balloon: free some memory from balloon on OOM
virtio_balloon: return the amount of freed memory from leak_balloon()
virtio_blk: fix race at module removal
virtio: Fix comment typo 'CONFIG_S_FAILED'
|
| | | |_|/ / / / / / / / / / / /
| |/| | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
Without the VIRTIO_ prefix CONFIG_S_FAILED looks like a Kconfig macro.
So use that prefix here too.
Signed-off-by: Paul Bolle <pebolle@tiscali.nl>
Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
|
| |\ \ \ \ \ \ \ \ \ \ \ \ \ \ \
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | |
git://git.kernel.org/pub/scm/linux/kernel/git/rzhang/linux
Pull thermal management update from Zhang Rui:
"Summary:
- of-thermal extension to allow drivers to register and use its
functionality in a better way, without exploiting thermal core.
From Lukasz Majewski.
- Fix a bug in intel_soc_dts_thermal driver which calls a sleep
function in interrupt handler. From Maurice Petallo.
- add a thermal UAPI header file for exporting the thermal generic
netlink information to user-space. From Florian Fainelli.
- First round of refactoring in Exynos driver. Bartlomiej and Lukasz
are attempting to make it lean and easier to understand.
- New thermal driver for Rockchip (rk3288), with support for DT
thermal. From Caesar Wang.
- New thermal driver for Nvidia, Tegra124 SOCTHERM driver, with
support for DT thermal. From Mikko Perttunen.
- New cooling device, based on common clock framework. From Eduardo
Valentin.
- a couple of small fixes in thermal core framework. From Srinivas
Pandruvada, Javi Merino, Luis Henriques.
- Dropping Armada A375-Z1 SoC thermal support as the chip is not in
the market, armada folks decided to drop its support.
- a couple of small fixes and cleanups in int340x thermal driver"
* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/rzhang/linux: (58 commits)
thermal: provide an UAPI header file
Thermal/int340x: Clear the error value of the last acpi_bus_get_device() call
thermal/powerclamp: add id for braswell cpu
thermal: Intel SoC DTS: Don't do thermal zone update inside spin_lock
Thermal: fix platform_no_drv_owner.cocci warnings
Thermal/int340x: avoid unnecessary pointer casting
thermal: int3403: Delete a check before thermal_zone_device_unregister()
thermal/int3400: export uuids
thermal: of: Extend current of-thermal.c code to allow setting emulated temp
thermal: of: Extend of-thermal to export table of trip points
thermal: of: Rename struct __thermal_trip to struct thermal_trip
thermal: of: Extend of-thermal.c to provide check if trip point is valid
thermal: of: Extend of-thermal.c to provide number of trip points
thermal: Fix error path in thermal_init()
thermal: lock the thermal zone when switching governors
thermal: core: ignore invalid trip temperature
thermal: armada: Remove support for A375-Z1 SoC
thermal: rockchip: add driver for thermal
dt-bindings: document Rockchip thermal
thermal: exynos: remove exynos_tmu_data.h include
...
|
| | | \ \ \ \ \ \ \ \ \ \ \ \ \ \ | |
| | | \ \ \ \ \ \ \ \ \ \ \ \ \ \ | |
| | |\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \
| | | | |_|/ / / / / / / / / / / /
| | | |/| | | | | | | | | | | | | |
|
| | | |\ \ \ \ \ \ \ \ \ \ \ \ \ \ \
| | | |/ / / / / / / / / / / / / /
| | |/| | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | |
git://git.kernel.org/pub/scm/linux/kernel/git/evalenti/linux-soc-thermal into eduardo-soc-thermal
|
| | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | |
Before this change it was only possible to set get_temp() and get_trend()
methods to be used in the common code handling passing parameters via
device tree to "cpu-thermal" CPU thermal zone device.
Now it is possible to also set emulated value of temperature for debug
purposes.
Signed-off-by: Lukasz Majewski <l.majewski@samsung.com>
Acked-by: Eduardo Valentin <edubezval@gmail.com>
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
|
| | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | |
This patch changes name of struct __thermal_trip to thermal_trip and moves
declaration of the latter to ./include/linux/thermal.h for better visibility.
Signed-off-by: Lukasz Majewski <l.majewski@samsung.com>
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
|
| | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | |
Different drivers request API extensions in of-thermal. For this reason,
additional callbacks are required to fit the new drivers needs.
The current API implementation expects the registering sensor driver
to provide a get_temp and get_trend callbacks as function parameters.
As the amount of callbacks is growing, this patch changes the existing
implementation to use a .ops field to hold all the of thermal callbacks
to sensor drivers.
This patch also changes the existing of-thermal users to fit the new
API design. No functional change is introduced in this patch.
Cc: Alexandre Courbot <gnurou@gmail.com>
Cc: devicetree@vger.kernel.org
Cc: Grant Likely <grant.likely@linaro.org>
Cc: Guenter Roeck <linux@roeck-us.net>
Cc: Jean Delvare <jdelvare@suse.de>
Cc: linux-kernel@vger.kernel.org
Cc: linux-pm@vger.kernel.org
Cc: linux-tegra@vger.kernel.org
Cc: lm-sensors@lm-sensors.org
Cc: Rob Herring <robh+dt@kernel.org>
Cc: Stephen Warren <swarren@wwwdotorg.org>
Cc: Thierry Reding <thierry.reding@gmail.com>
Cc: Zhang Rui <rui.zhang@intel.com>
Acked-by: Guenter Roeck <linux@roeck-us.net>
Tested-by: Mikko Perttunen <mikko.perttunen@kapsi.fi>
Reviewed-by: Mikko Perttunen <mikko.perttunen@kapsi.fi>
Reviewed-by: Alexandre Courbot <acourbot@nvidia.com>
Reviewed-by: Lukasz Majewski <l.majewski@samsung.com>
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
|
| | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | |
This patch introduces a new thermal cooling device based on common clock
framework. The original motivation to write this cooling device is to be
able to cool down thermal zones using clocks that feed co-processors, such
as GPUs, DSPs, Image Processing Co-processors, etc. But it is written
in a way that it can be used on top of any clock.
The implementation is pretty straight forward. The code creates
a thermal cooling device based on a pair of a struct device and a clock name.
The struct device is assumed to be usable by the OPP layer. The OPP layer
is used as source of the list of possible frequencies. The (cpufreq) frequency
table is then used as a map from frequencies to cooling states. Cooling
states are indexes to the frequency table.
The logic sits on top of common clock framework, specifically on clock
pre notifications. Any PRE_RATE_CHANGE is hijacked, and the transition is
only allowed when the new rate is within the thermal limit (cooling state -> freq).
When a thermal cooling device state transition is requested, the clock
is also checked to verify if the current clock rate is within the new
thermal limit.
Cc: Zhang Rui <rui.zhang@intel.com>
Cc: Mike Turquette <mturquette@linaro.org>
Cc: Nishanth Menon <nm@ti.com>
Cc: Pavel Machek <pavel@ucw.cz>
Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
Cc: Len Brown <len.brown@intel.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: linux-pm@vger.kernel.org
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Eduardo Valentin <eduardo.valentin@ti.com>
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
|
| | |/ / / / / / / / / / / / / / /
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | |
include/linux/thermal.h contains definitions for the Thermal generic
netlink family, but none of the valuable information relevant to
user-space such as the Genl family name, multicast group, version or
command set and data types is exported to user-space.
Export all the relevant generic netlink information to user-space to
make this genl family usable by user-space, and while at it, export
THERMAL_NAME_LENGTH since it limits name length for thermal_hwmon
devices.
Kbuild and MAINTAINERS are also updated accordingly to reflect this new
file: include/uapi/linux/thermal.h.
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
|
| |\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \
| |_|_|_|_|_|_|_|_|_|_|_|/ / / /
|/| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | |
Pull MTD updates from Brian Norris:
"Summary:
- Add device tree support for DoC3
- SPI NOR:
Refactoring, for better layering between spi-nor.c and its
driver users (e.g., m25p80.c)
New flash device support
Support 6-byte ID strings
- NAND:
New NAND driver for Allwinner SoC's (sunxi)
GPMI NAND: add support for raw (no ECC) access, for testing
purposes
Add ATO manufacturer ID
A few odd driver fixes
- MTD tests:
Allow testers to compensate for OOB bitflips in oobtest
Fix a torturetest regression
- nandsim: Support longer ID byte strings
And more"
* tag 'for-linus-20141215' of git://git.infradead.org/linux-mtd: (63 commits)
mtd: tests: abort torturetest on erase errors
mtd: physmap_of: fix potential NULL dereference
mtd: spi-nor: allow NULL as chip name and try to auto detect it
mtd: nand: gpmi: add raw oob access functions
mtd: nand: gpmi: add proper raw access support
mtd: nand: gpmi: add gpmi_copy_bits function
mtd: spi-nor: factor out write_enable() for erase commands
mtd: spi-nor: add support for s25fl128s
mtd: spi-nor: remove the jedec_id/ext_id
mtd: spi-nor: add id/id_len for flash_info{}
mtd: nand: correct the comment of function nand_block_isreserved()
jffs2: Drop bogus if in comment
mtd: atmel_nand: replace memcpy32_toio/memcpy32_fromio with memcpy
mtd: cafe_nand: drop duplicate .write_page implementation
mtd: m25p80: Add support for serial flash Spansion S25FL132K
MTD: m25p80: fix inconsistency in m25p_ids compared to spi_nor_ids
mtd: spi-nor: improve wait-till-ready timeout loop
mtd: delete unnecessary checks before two function calls
mtd: nand: omap: Fix NAND enumeration on 3430 LDP
mtd: nand: add ATO manufacturer info
...
|
| | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | |
Tested with ATO AFND1G08U3, 128MiB NAND.
Signed-off-by: Brian Norris <computersforpeace@gmail.com>
|
| | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | |
read_page_raw and write_page_raw method description is not clear enough.
It clearly specifies that ECC correction should not be involved but does
not talk about specific layout (by layout I mean where in-band and
out-of-band data are stored on the NAND media) used by NAND/ECC
controllers.
Those specific layouts might impact MTD users and thus should be hidden (as
already done in the standard NAND_ECC_HW_SYNDROME implementation).
Clearly state this constraint in the nand_ecc_ctrl struct documentation.
Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Signed-off-by: Brian Norris <computersforpeace@gmail.com>
|
| | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | |
Freescale's QorIQ T Series processors support 8 IFC chip selects
within a memory map backward compatible with previous P Series
processors which supported only 4 chip selects.
Signed-off-by: Aaron Sierra <asierra@xes-inc.com>
Signed-off-by: Brian Norris <computersforpeace@gmail.com>
|
| | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | |
We don't need to expose a 'wait-till-ready' interface to drivers. Status
register polling should be handled by the core spi-nor.c library, and as
of now, I see no need to provide a special driver-specific hook for it.
Signed-off-by: Brian Norris <computersforpeace@gmail.com>
Reviewed-by: Marek Vasut <marex@denx.de>
|
| | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | |
spi-nor.c should be taking care of these now.
Signed-off-by: Brian Norris <computersforpeace@gmail.com>
Reviewed-by: Marek Vasut <marex@denx.de>
|
| | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | |
There is currently no useful way to override the default
implementation of this operation. The returned struct spi_device_id
must have a pointer to struct flash_info in its private data, but this
structure is defined inside spi-nor.
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: Brian Norris <computersforpeace@gmail.com>
|
| | |\ \ \ \ \ \ \ \ \ \ \ \ \ \ \
| | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | |
To keep the two paths in sync
|
| |\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \
| | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | | |
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull vfs pile #2 from Al Viro:
"Next pile (and there'll be one or two more).
The large piece in this one is getting rid of /proc/*/ns/* weirdness;
among other things, it allows to (finally) make nameidata completely
opaque outside of fs/namei.c, making for easier further cleanups in
there"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
coda_venus_readdir(): use file_inode()
fs/namei.c: fold link_path_walk() call into path_init()
path_init(): don't bother with LOOKUP_PARENT in argument
fs/namei.c: new helper (path_cleanup())
path_init(): store the "base" pointer to file in nameidata itself
make default ->i_fop have ->open() fail with ENXIO
make nameidata completely opaque outside of fs/namei.c
kill proc_ns completely
take the targets of /proc/*/ns/* symlinks to separate fs
bury struct proc_ns in fs/proc
copy address of proc_ns_ops into ns_common
new helpers: ns_alloc_inum/ns_free_inum
make proc_ns_operations work with struct ns_common * instead of void *
switch the rest of proc_ns_operations to working with &...->ns
netns: switch ->get()/->put()/->install()/->inum() to working with &net->ns
make mntns ->get()/->put()/->install()/->inum() work with &mnt_ns->ns
common object embedded into various struct ....ns
|
| | | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | | |
As it is, default ->i_fop has NULL ->open() (along with all other methods).
The only case where it matters is reopening (via procfs symlink) a file that
didn't get its ->f_op from ->i_fop - anything else will have ->i_fop assigned
to something sane (default would fail on read/write/ioctl/etc.).
Unfortunately, such case exists - alloc_file() users, especially
anon_get_file() ones. There we have tons of opened files of very different
kinds sharing the same inode. As the result, attempt to reopen those via
procfs succeeds and you get a descriptor you can't do anything with.
Moreover, in case of sockets we set ->i_fop that will only be used
on such reopen attempts - and put a failing ->open() into it to make sure
those do not succeed.
It would be simpler to put such ->open() into default ->i_fop and leave
it unchanged both for anon inode (as we do anyway) and for socket ones. Result:
* everything going through do_dentry_open() works as it used to
* sock_no_open() kludge is gone
* attempts to reopen anon-inode files fail as they really ought to
* ditto for aio_private_file()
* ditto for perfmon - this one actually tried to imitate sock_no_open()
trick, but failed to set ->i_fop, so in the current tree reopens succeed and
yield completely useless descriptor. Intent clearly had been to fail with
-ENXIO on such reopens; now it actually does.
* everything else that used alloc_file() keeps working - it has ->i_fop
set for its inodes anyway
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
| | | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | | |
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
| | |\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ |
|
| | | | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | | | |
New pseudo-filesystem: nsfs. Targets of /proc/*/ns/* live there now.
It's not mountable (not even registered, so it's not in /proc/filesystems,
etc.). Files on it *are* bindable - we explicitly permit that in do_loopback().
This stuff lives in fs/nsfs.c now; proc_ns_fget() moved there as well.
get_proc_ns() is a macro now (it's simply returning ->i_private; would
have been an inline, if not for header ordering headache).
proc_ns_inode() is an ex-parrot. The interface used in procfs is
ns_get_path(path, task, ops) and ns_get_name(buf, size, task, ops).
Dentries and inodes are never hashed; a non-counting reference to dentry
is stashed in ns_common (removed by ->d_prune()) and reused by ns_get_path()
if present. See ns_get_path()/ns_prune_dentry/nsfs_evict() for details
of that mechanism.
As the result, proc_ns_follow_link() has stopped poking in nd->path.mnt;
it does nd_jump_link() on a consistent <vfsmount,dentry> pair it gets
from ns_get_path().
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
| | | | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | | | |
a) make get_proc_ns() return a pointer to struct ns_common
b) mirror ns_ops in dentry->d_fsdata of ns dentries, so that
is_mnt_ns_file() could get away with fewer dereferences.
That way struct proc_ns becomes invisible outside of fs/proc/*.c
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
| | | | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | | | |
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
| | | | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | | | |
take struct ns_common *, for now simply wrappers around proc_{alloc,free}_inum()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
| | | | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | | | |
We can do that now. And kill ->inum(), while we are at it - all instances
are identical.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
| | | | |_|_|_|_|_|/ / / / / / / / / /
| | |/| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | | |
for now - just move corresponding ->proc_inum instances over there
Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
| |\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \
| | | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | | | |
Pull nfsd updates from Bruce Fields:
"A comparatively quieter cycle for nfsd this time, but still with two
larger changes:
- RPC server scalability improvements from Jeff Layton (using RCU
instead of a spinlock to find idle threads).
- server-side NFSv4.2 ALLOCATE/DEALLOCATE support from Anna
Schumaker, enabling fallocate on new clients"
* 'for-3.19' of git://linux-nfs.org/~bfields/linux: (32 commits)
nfsd4: fix xdr4 count of server in fs_location4
nfsd4: fix xdr4 inclusion of escaped char
sunrpc/cache: convert to use string_escape_str()
sunrpc: only call test_bit once in svc_xprt_received
fs: nfsd: Fix signedness bug in compare_blob
sunrpc: add some tracepoints around enqueue and dequeue of svc_xprt
sunrpc: convert to lockless lookup of queued server threads
sunrpc: fix potential races in pool_stats collection
sunrpc: add a rcu_head to svc_rqst and use kfree_rcu to free it
sunrpc: require svc_create callers to pass in meaningful shutdown routine
sunrpc: have svc_wake_up only deal with pool 0
sunrpc: convert sp_task_pending flag to use atomic bitops
sunrpc: move rq_cachetype field to better optimize space
sunrpc: move rq_splice_ok flag into rq_flags
sunrpc: move rq_dropme flag into rq_flags
sunrpc: move rq_usedeferral flag to rq_flags
sunrpc: move rq_local field to rq_flags
sunrpc: add a generic rq_flags field to svc_rqst and move rq_secure to it
nfsd: minor off by one checks in __write_versions()
sunrpc: release svc_pool_map reference when serv allocation fails
...
|
| | | | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | | | |
Testing has shown that the pool->sp_lock can be a bottleneck on a busy
server. Every time data is received on a socket, the server must take
that lock in order to dequeue a thread from the sp_threads list.
Address this problem by eliminating the sp_threads list (which contains
threads that are currently idle) and replacing it with a RQ_BUSY flag in
svc_rqst. This allows us to walk the sp_all_threads list under the
rcu_read_lock and find a suitable thread for the xprt by doing a
test_and_set_bit.
Note that we do still have a potential atomicity problem however with
this approach. We don't want svc_xprt_do_enqueue to set the
rqst->rq_xprt pointer unless a test_and_set_bit of RQ_BUSY returned
zero (which indicates that the thread was idle). But, by the time we
check that, the bit could be flipped by a waking thread.
To address this, we acquire a new per-rqst spinlock (rq_lock) and take
that before doing the test_and_set_bit. If that returns false, then we
can set rq_xprt and drop the spinlock. Then, when the thread wakes up,
it must set the bit under the same spinlock and can trust that if it was
already set then the rq_xprt is also properly set.
With this scheme, the case where we have an idle thread no longer needs
to take the highly contended pool->sp_lock at all, and that removes the
bottleneck.
That still leaves one issue: What of the case where we walk the whole
sp_all_threads list and don't find an idle thread? Because the search is
lockess, it's possible for the queueing to race with a thread that is
going to sleep. To address that, we queue the xprt and then search again.
If we find an idle thread at that point, we can't attach the xprt to it
directly since that might race with a different thread waking up and
finding it. All we can do is wake the idle thread back up and let it
attempt to find the now-queued xprt.
Signed-off-by: Jeff Layton <jlayton@primarydata.com>
Tested-by: Chris Worley <chris.worley@primarydata.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
| | | | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | | | |
In a later patch, we'll be removing some spinlocking around the socket
and thread queueing code in order to fix some contention problems. At
that point, the stats counters will no longer be protected by the
sp_lock.
Change the counters to atomic_long_t fields, except for the
"sockets_queued" counter which will still be manipulated under a
spinlock.
Signed-off-by: Jeff Layton <jlayton@primarydata.com>
Tested-by: Chris Worley <chris.worley@primarydata.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
| | | | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | | | |
...also make the manipulation of sp_all_threads list use RCU-friendly
functions.
Signed-off-by: Jeff Layton <jlayton@primarydata.com>
Tested-by: Chris Worley <chris.worley@primarydata.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|