Dan Walsh's Blog

Last Build Date: Fri, 05 May 2017 15:00:20 GMT


SELinux and --no-new-privs and the setpriv command.

Fri, 05 May 2017 15:00:20 GMT


SELinux transitions are in some ways similar to a setuid executable, in that when a transition happens the new process has different security properties than the calling process.  When you execute a setuid executable, your parent process has one UID, but the child process has a different UID.

The kernel has a way to block these transitions, called --no-new-privs.  If I turn on --no-new-privs, then setuid applications like sudo, su and others will no longer work.  That is, a process cannot get more privileges than its parent process.

SELinux does something similar and in most cases the transition is just blocked.

For example.

We have a rule that states that a httpd_t process executing a script labeled httpd_sys_script_exec_t (the cgi label) will transition to httpd_sys_script_t.  But if you tell the kernel that your process tree has --no-new-privs then, depending on how you wrote the policy, when the process running as httpd_t executes the httpd_sys_script_exec_t script it will no longer transition; it will attempt to continue to run the script as httpd_t.

SELinux enforces that the new transition type must have a subset of the allow rules of its parent process, and it can have no other allow rules.  That is, the transitioned process cannot get any NEW privs.

This feature has not been used much in SELinux Policy, so we have found and fixed a few issues.

Container Runtimes like docker and runc have now added the --no-new-privs flag.  We have been working to make container-selinux follow the rules.

The container runtimes running as container_runtime_t can start a container_t process only if container_t has a subset of the privileges of its parent process, container_runtime_t.

In SELinux policy you write a rule like:

typebounds container_runtime_t container_t;

This tells the policy compiler to make sure that container_t is a subset of container_runtime_t.  If you are running the compiler in strict mode, the compiler will fail if it is not a subset.
If you are not running in strict mode, the compiler will silently remove any allow rules that are not in the parent, which can cause some surprises.
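
The subset requirement can be pictured with a toy shell check.  This is only a sketch of the idea: the permission strings and the unbounded_perms helper are made up for illustration, and are neither policy syntax nor a real SELinux tool.

```shell
# Toy model of a typebounds check (hypothetical helper, not a real SELinux tool).
# $1 = space-separated permissions of the parent (bounding) type
# $2 = space-separated permissions of the child (bounded) type
# Prints every child permission that the parent does not have.
unbounded_perms() {
    for perm in $2; do
        case " $1 " in
            *" $perm "*) ;;          # parent has it too: within the bound
            *) echo "$perm" ;;       # child-only permission: breaks the bound
        esac
    done
}

# Illustrative permission sets, not taken from the real policy.
parent_perms="file:read file:write process:fork"
child_perms="file:read process:fork socket:connect"

unbounded_perms "$parent_perms" "$child_perms"
```

Anything this prints stands for a rule that a strict compiler would reject and a non-strict compiler would silently drop from the child type.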

setpriv command

I recently heard about the setpriv command which is pretty cool for playing with kernel security features like dropping capabilities, and SELinux. One of the things you can do is execute

$ setpriv --no-new-privs sudo sh
sudo: effective uid is not 0, is sudo installed setuid root?

But if you want to try out SELinux you could combine the setpriv command and runcon together

$ setpriv --no-new-privs runcon -t container_t id -Z
$ setpriv --no-new-priv runcon -t container_t id -Z
runcon: id: Operation not permitted

This happens because container_t is not type bounded by staff_t.

Be careful relabeling volumes with Container run times. Sometimes things can go very wrong?

Fri, 07 Apr 2017 04:08:43 GMT

I recently received an email from someone who made the mistake of volume mounting /root into his container with the :Z option.

docker run -ti -v /root:/root:Z fedora sh

The container ran fine, and everything was well on his server machine until the next time he tried to ssh into the server.  The sshd refused to allow him in.  What went wrong?

I wrote about using volumes and SELinux on the project atomic blog.  I explain there that in order to use a volume within a non-privileged container, you need to relabel the content on the volume.  You can either use the :z or the :Z option.

:z will relabel with a shared label so other containers can read and write the volume.

:Z will relabel with a private label so that only this specific container can read and write the volume.

What I probably did not emphasize enough is that, as Peter Parker (Spider Man) says: With great power comes great responsibility.

Meaning you have to be careful what you relabel.  Using one of the :Z and :z options recursively changes the labels of the source content to container_file_t.  When doing this you must be sure that this content is truly private to the container and is not needed by other confined domains.  For example doing a volume mount of -v /var/lib/mariadb:/var/lib/mariadb:Z for a mariadb container is probably a good idea.  But while doing -v /var/lib:/var/lib:Z will work, it is probably a bad idea.

Back to the email: the user relabeled all of the content under /root with a label similar to system_u:object_r:container_file_t:s0:c103,c753.

Later when he attempted to ssh in, the sshd daemon, running as the sshd_t type, attempted to read content in /root/.ssh and got permission denied, since sshd_t is not allowed to read container_file_t.

The emailer realized what happened and tried to fix the situation by running restorecon -R -v /root, but this failed to change the labels.

Why did the labels not change when he ran restorecon?

There is a little known feature of restorecon called customizable_types, that I talked about 10 years ago.

By default, restorecon does not change types defined in the customizable_types file.  These types can be randomly scattered around the file system, and we don't want a global relabel to change them.  This is meant to make things easier for the admin, but sometimes causes confusion.  The -F option tells restorecon to force the relabel and ignore customizable_types.

restorecon -F -R /root

This command will reset the labels under /root to the system default and allow the emailer to login to the system via sshd again.

We have safeguards built into the SELinux go bindings which prevent container runtimes from relabeling /, /etc, and /usr.  I need to open a pull request to add a few more directories to help prevent users from making serious mistakes in labeling, starting with /root.[...]
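
Until such safeguards cover more directories, a wrapper script could do its own pre-flight check before passing a path with :Z or :z.  The function below is a hypothetical sketch, not part of docker or of the SELinux go bindings; the refused list simply mirrors the directories named above (/, /etc, /usr) plus /root.

```shell
# Hypothetical pre-flight check: succeed only when a host path looks safe
# to relabel with :Z or :z.  The refused list is an assumption taken from
# the directories mentioned in the text, not from any real implementation.
relabel_allowed() {
    # Strip one trailing slash so "/etc/" is treated like "/etc";
    # "/" itself becomes the empty string.
    case "${1%/}" in
        ""|/etc|/usr|/root) return 1 ;;   # system content: refuse
        *) return 0 ;;                    # anything else: allow
    esac
}

for src in /root /var/lib/mariadb; do
    if relabel_allowed "$src"; then
        echo "ok to use -v $src:$src:Z"
    else
        echo "refusing to relabel $src"
    fi
done
```

The check is deliberately narrow: it only matches the listed directories themselves, so it would not catch something like /var/lib, which still needs human judgment.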

Understanding SELinux Roles

Fri, 02 Dec 2016 22:03:20 GMT

I received a container bugzilla today from someone who was attempting to assign a container process to the object_r role.  Hopefully this blog will help explain how roles work with SELinux.

When we describe SELinux we often concentrate on Type Enforcement, which is the most important and most used feature of SELinux.  This is what we describe in the SELinux Coloring book as Dogs and Cats.  We also describe MLS/MCS Separation in the coloring book.

Lets look at the SELinux labels

The SELinux labels consist of four parts: User, Role, Type and Level.  They often look something like

user_u:role_r:type_t:level

One area I do not cover is Roles and SELinux Users.

The analogy I like to use for the Users and Roles is Russian dolls, in that the User controls the reachable roles and the roles control the reachable types.

When we create an SELinux User, we have to specify which roles are reachable within the user.  (We also specify which levels are available to the user.)

semanage user -l

                Labeling   MLS/       MLS/
SELinux User    Prefix     MCS Level  MCS Range             SELinux Roles

guest_u         user       s0         s0                    guest_r
root            user       s0         s0-s0:c0.c1023        staff_r sysadm_r system_r unconfined_r
staff_u         user       s0         s0-s0:c0.c1023        staff_r sysadm_r system_r unconfined_r
sysadm_u        user       s0         s0-s0:c0.c1023        sysadm_r
system_u        user       s0         s0-s0:c0.c1023        system_r unconfined_r
unconfined_u    user       s0         s0-s0:c0.c1023        system_r unconfined_r
user_u          user       s0         s0                    user_r
xguest_u        user       s0         s0                    xguest_r

In the example above you see the staff_u user is able to reach the staff_r sysadm_r system_r unconfined_r roles, and is able to have any level in the MCS Range s0-s0:c0.c1023.

Notice also the system_u user, which can reach the system_r unconfined_r roles as well as the complete MCS Range.  system_u is the default user for all processes started at boot or started by s[...]
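
To make the four-part label structure concrete, here is a small sketch that splits a label into its fields.  The label_field helper is hypothetical, written only for illustration; it relies on the fact that the level is everything after the third colon, so a level such as s0:c1,c2 stays in one piece.

```shell
# Hypothetical helper: print one field of an SELinux label.
# $1 = label such as user_u:role_r:type_t:level
# $2 = which field to print: user, role, type or level
label_field() {
    echo "$1" | {
        # read assigns the remainder of the line, colons included, to the
        # last variable, so the level keeps its own ":" separators.
        IFS=: read -r user role type level
        case "$2" in
            user)  echo "$user" ;;
            role)  echo "$role" ;;
            type)  echo "$type" ;;
            level) echo "$level" ;;
        esac
    }
}

label_field staff_u:staff_r:staff_t:s0-s0:c0.c1023 role
label_field staff_u:staff_r:staff_t:s0-s0:c0.c1023 level
```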

Tug of war between SELinux and Chrome Sandbox, who's right?

Mon, 31 Oct 2016 11:56:24 GMT

Background

Over the years, people have wanted to use SELinux to confine the web browser.  The most common vulnerability for a desktop user is attacks caused by bugs in the browser.  A user goes to a questionable web site, and the web site has code that triggers a bug in the browser that takes over your machine.  Even if the browser has no bugs, you have to worry about helper plugins, like flash-plugin, having vulnerabilities.

I wrote about confining the browser all the way back in 2008.  As I explained then, confining the browser without breaking expected functionality is impossible, but we wanted to confine the "plugins".  Luckily Mozilla and Google also wanted to confine the plugins, so they broke them into separate programs which we can wrap with SELinux labels.  By default, on all systems running firefox or chrome, the plugins are locked down by SELinux, preventing them from attacking your home dir.  But sometimes we have had bugs/conflicts in this confinement.

SELinux versus Chrome.

We have been seeing bug reports like the following for the last couple of years.

SELinux is preventing chrome-sandbox from 'write' accesses on the file oom_score_adj.

I believe the chrome-sandbox is trying to tell the kernel to pick it as a candidate to be killed if the machine gets under memory pressure, and the kernel has to pick some processes to kill.

But SELinux blocks this access, generating an AVC that looks like:

```
type=AVC msg=audit(1426020354.763:876): avc:  denied  { write } for  pid=7673 comm="chrome-sandbox" name="oom_score_adj" dev="proc" ino=199187 scontext=unconfined_u:unconfined_r:chrome_sandbox_t:s0-s0:c0.c1023 tcontext=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023 tclass=file permissive=0
```

This SELinux AVC indicates that the chrome-sandbox running as chrome_sandbox_t is trying to write to the oom_score_adj field in its parent process, most likely the chrome browser process, running as unconfined_t.  The only files labeled as unconfined_t on a system are virtual files in the /proc file system.

SELinux is probably blocking something that should be allowed, but ...

We have gone back and forth on how to fix this issue.

If you read the bugzilla

Scott R. Godin points out:

according to the final comment at

"Restricting further comments.

As explained in and this is working as intended."

so, google says 'works as intended'. *cough* (earlier discussion of this) says it's a google-chrome issue not an selinux issue.

I dunno who to believe at this point.

I respond:

I would say it is a difference of opinion.  From Google's perspective they are just allowing the chrome-sandbox to tell the kernel to pick chrome when looking to kill processes on the system if they run out of memory.  But from the SELinux point of view it can't tell the difference between the chrome browser and other processes on the system that are labeled as unconfined_t.  From a MAC point of view, this would allow the chrome-sandbox to pick any other user process to kill if running out of memory.  It is even worse than this: allowing chrome_sandbox_t to write to files labeled as unconfined_t allows the chrome-sandbox to write to file objects under /proc of almost every process in the user session.  Allowing this access would destroy the effectiveness of the SELinux confinement on the process.

If we want to control the chrome-sandbox from a MAC perspective, we don't want to allow this.  Bottom line, both sides are in some ways correct.

This is why you have a boolean and a dontaudit rule.  If you don't want MAC confinement of chrome sandbox then you ca[...]
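
When reading AVCs like the one above, the two type fields are usually what matter.  As a rough sketch, the avc_types function below (a made-up helper, not part of the audit tools) pulls the source and target types out of an AVC record with sed.  It assumes each context has a level after the type, as on MCS/MLS systems.

```shell
# Hypothetical helper: read one AVC record on stdin and print
# "source_type target_type".  Assumes contexts of the form
# user:role:type:level, so the type is always followed by a colon.
avc_types() {
    sed -n 's/.* scontext=[^:]*:[^:]*:\([^:]*\):.* tcontext=[^:]*:[^:]*:\([^:]*\):.*/\1 \2/p'
}

# Shortened copy of the record quoted in the text.
echo 'avc:  denied  { write } for  comm="chrome-sandbox" name="oom_score_adj" scontext=unconfined_u:unconfined_r:chrome_sandbox_t:s0-s0:c0.c1023 tcontext=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023 tclass=file permissive=0' | avc_types
```

For the chrome-sandbox record this yields chrome_sandbox_t as the source and unconfined_t as the target, which is exactly the pairing the disagreement is about.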

docker-selinux changed to container-selinux

Tue, 04 Oct 2016 12:33:21 GMT

Changing upstream packages

I have decided to rename the docker SELinux policy package from docker-selinux to container-selinux.

The main reason I did this was after seeing the following on twitter.   Docker, INC is requesting people not use docker prefix for packages on github.

Since the policy for container-selinux can be used for more container runtimes than just docker, this seems like a good idea.  I plan on using it for OCID, and would consider plugging it into the RKT CRI.

I have modified all of the types inside of the policy to container_*.  For instance docker_t is now container_runtime_t and docker_exec_t is container_runtime_exec_t.

I have taken advantage of the typealias capability of SELinux policy to allow the types to be preserved over an upgrade.

typealias container_runtime_t alias docker_t;
typealias container_runtime_exec_t alias docker_exec_t;

This means people can continue to use docker_t and docker_exec_t with tools but the kernel will automatically translate them to the primary name container_runtime_t and container_runtime_exec_t.

This policy is arriving today in rawhide in the container-selinux.rpm which obsoletes the docker-selinux.rpm.  Once we are confident about the upgrade path, we will be rolling out the new packaging to Fedora and eventually to RHEL and CentOS.

Changing the types associated with container processes.

Secondarily I have begun to change the type names for running containers.  Way back when I wrote the first policy for containers, we were using libvirt_lxc for launching containers, and we already had types defined for VMs launched out of libvirt.  VMs were labeled svirt_t.  When I decided to extend the policy for containers I decided on extending svirt with lxc, giving svirt_lxc.  But I also wanted to show that it had full network access, hence svirt_lxc_net_t.  I labeled the content inside of the container svirt_sandbox_file_t.

Bad names...

Once containers exploded on the scene with the arrival of docker, I knew I had made a mistake choosing the types associated with container processes.  Time to clean this up.  I have submitted pull requests into selinux-policy to change these types to container_t and container_image_t.

typealias container_t alias svirt_lxc_net_t;
typealias container_image_t alias svirt_sandbox_file_t;

The old types will still work due to typealias, but I think it would become a lot easier for people to understand the SELinux types with simpler names.  There is a lot of documentation and "google" knowledge out there about svirt_lxc_net_t and svirt_sandbox_file_t, which we can modify over time.

Luckily I have a chance at a do-over.

What is the spc_t container type, and why didn't we just run as unconfined_t?

Mon, 03 Oct 2016 17:00:29 GMT

What is spc_t?

SPC stands for Super Privileged Container, which are containers that contain software used to manage the host system that the container will be running on.  Since these containers could do anything on the system and we don't want SELinux blocking any access we made spc_t an unconfined domain. 

If you are on an SELinux system, and run docker with SELinux separation turned off, the containers will run with the spc_t type.

You can disable SELinux container separation in docker in multiple different ways.

  • You build docker from scratch without the BUILDTAG=selinux flag.

  • You run the docker daemon without --selinux-enabled flag

  • You run a container with the --security-opt label:disable flag

          docker run -ti --security-opt label:disable fedora sh

  • You share the PID namespace or IPC namespace with the host

         docker run -ti --pid=host --ipc=host fedora sh
Note: we have to disable SELinux separation in ipc=host  and pid=host because it would block access to processes or the IPC mechanisms on the host.

Why not use unconfined_t?

The question that comes up is: why not just run as unconfined_t?  A lot of people falsely assume that unconfined_t is the only unconfined domain.  But unconfined_t is a user domain.  We block most confined domains from communicating with the unconfined_t domain, since this is probably the domain that the administrator is running with.

What is different about spc_t?

First off, the type docker runs as (docker_t) can transition to spc_t, but it is not allowed to transition to unconfined_t.  It transitions to this domain when it executes programs located under /var/lib/docker.

# sesearch -T -s docker_t | grep spc_t
   type_transition docker_t docker_share_t : process spc_t;
   type_transition docker_t docker_var_lib_t : process spc_t;
   type_transition docker_t svirt_sandbox_file_t : process spc_t;

Secondly and most importantly confined domains are allowed to connect to unix domain sockets running as spc_t.

This means I could run a service as a container process and have it create a socket on /run on the host system, and other confined domains on the host could communicate with the service.

For example if you wanted to create a container that runs sssd, and wanted to allow confined domains to be able to get passwd information from it, you could run it as spc_t and the confined login programs would be able to use it.


Sometimes you can create an unconfined domain that you want to allow one or more confined domains to communicate with.  In this situation it is usually better to create a new domain, rather than reusing unconfined_t.

Fun with bash, or how I wasted an hour trying to debug some SELinux test scripts.

Fri, 03 Jun 2016 10:12:35 GMT

We are working to get SELinux and Overlayfs to work well together.  Currently you can not run docker containers
with SELinux on an Overlayfs back end.  You should see the patches posted to the kernel list within a week.

I have been tasked to write selinuxtestsuite tests to verify overlayfs works correctly with SELinux.
These tests will help people understand what we intended.

One of the requirements for overlayfs/SELinux is to check not only the access of the task process doing some access
but also the label of the processes that originally setup the overlayfs mount.

In order to do the test I created two process types test_overlay_mounter_t and test_overlay_client_t, and then I was using
runcon to execute a bash script in the correct context.  I added code like the following to the test to make sure that the runcon command was working.

# runcon -t test_overlay_mounter_t bash <<EOF
echo "Mounting as $(id -Z)"
EOF

The problem was when I ran the tests, I saw the following:

Mounting as unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023

Sadly it took me an hour to diagnose what was going on, writing several test scripts and running commands by hand.  Sometimes it seemed to work and other times it would not.  I thought there was a problem with runcon or with my SELinux policy.  Finally I took a break and came back to the problem, realizing that the problem was with bash.  The $(id -Z) was
expanded by the calling shell, still running as unconfined_t, before the runcon command was ever executed.
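
The same ordering problem can be reproduced without SELinux at all.  In this sketch a plain environment variable stands in for the process context: the variable WHO and the two function names are made up for the demonstration, and sh -c plays the part of the child started by runcon.

```shell
# WHO stands in for "the context I am running in".
WHO=outer

# Double quotes: the calling shell expands $WHO while building the command
# line, so the child is handed the literal text "echo outer".
expanded_by_parent() { WHO=inner sh -c "echo $WHO"; }

# Single quotes: the text reaches the child untouched, and the child
# expands $WHO itself using its own environment.
expanded_by_child() { WHO=inner sh -c 'echo $WHO'; }

expanded_by_parent
expanded_by_child
```

The first call reports the parent's value and the second reports the child's.  Any construct that lets the calling shell do the expansion will show where the caller is running, not where the child is.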

Sometimes you feel like an idiot.

runcon -t test_overlay_mounter_t bash <<EOF
echo "Mounting as $(id -Z)"
echo -n "Mounting as "
id -Z
EOF
Mounting as unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023
Mounting as unconfined_u:unconfined_r:test_overlay_mounter_t:s0-s0:c0.c1023

My next blog will explain how we expect overlayfs to work with SELinux.

Passing Unix Socket File Descriptors between containers processes blocked by SELinux.

Mon, 09 May 2016 16:52:34 GMT

SELinux controls passing of Socket file descriptors between processes.

A Fedora user posted a bugzilla complaining about SELinux blocking transfer of socket file descriptors between two docker containers.

Lets look at what happens when a socket file descriptor is created by a process.

When a process accepts a connection from a remote system, the file descriptor that is created automatically gets assigned the same label as the process creating the socket.  For example when the docker service (docker_t) listens on /var/run/docker.sock and a client connects to the docker service, the docker service end of the connection gets labeled by default with the label of the docker process.  On my machine this is:


The client is probably running as unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023.  SELinux would then check to make sure that unconfined_t is able to connect to docker_t sockets.

If this socket descriptor is passed to another process the new process label has to have access to the socket with the "socket label".  If it does not SELinux will block the transfer.

In containers, even though by default all container processes have the same SELinux type, they have different MCS labels.

If I have a process labeled system_u:system_r:svirt_lxc_net_t:s0:c1,c2 and I pass that file descriptor to a process in a different container labeled system_u:system_r:svirt_lxc_net_t:s0:c4,c5, SELinux will block the access.

The bug reporter was reporting that by default he was not able to pass the descriptor, which is goodness. We would not want to allow a confined container to be able to read/write socket file descriptors from another container by default.

The reporter also figured out that he could get this to work by disabling SELinux either on the host or inside of the container.

Surprisingly he also figured out if he shared IPC namespaces between the containers, SELinux would not block.

The reason for this is that when you share the same IPC namespace, docker automatically causes the containers to share the same SELinux label.  If docker did not do this, SELinux would block processes in container A from accessing IPCs created in container B.  With a shared IPC namespace the SELinux labels for both of the reporter's containers were the same, and SELinux allowed the passing.

How would I make two containers share the same SELinux labels?

Docker by default launches all containers with the same type field, but different MCS labels.  I told the reporter that you could cause two containers to run with the same MCS labels by using the --security-opt label:level:MCSLABEL option.

Something like this will work

docker run -it --rm --security-opt label:level:s0:c1000,c1001 --name server -v myvol:/tmp test /server
docker run -it --rm --security-opt label:level:s0:c1000,c1001 --name client -v myvol:/tmp test /client

These containers would then run with the same MCS labels, which would give the reporter the best security possible and still allow the two containers to pass the socket between them.  These containers would still be locked down with SELinux from the host and other containers.  They would be able to attack each other from an SELinux point of view, but the other container separation security would still be in effect to prevent such attacks.

Its a good thing SELinux blocks access to the docker socket.

Fri, 08 Apr 2016 13:35:38 GMT

I have seen lots of SELinux bugs being reported where users are running a container that volume mounts the docker.sock into a container.  The container then uses a docker client to do something with docker.  While I appreciate that a lot of these containers probably need this access, I am not sure people realize that this is equivalent to giving the container full root outside of the container on the host system.  I just execute the following command and I have full root access on the host.

docker run -ti --privileged -v /:/host fedora chroot /host

SELinux definitely shows its power in this case by blocking the access.  From a security point of view, we definitely want to block all confined containers from talking to the docker.sock.  Sadly the other security mechanisms on by default in containers do NOT block this access.  If a process somehow breaks out of a container and gets write access to the docker.sock, your system is pwned on an SELinux disabled system. (User Namespace, if it is enabled, will also block this access going forward).

If you have to run a container that talks to the docker.sock you need to turn off the SELinux protection.  There are two ways to do this.

You can turn off all container security separation by using the --privileged flag. Since you are giving the container full access to your system from a security point of view, you probably should just do this.

docker run --privileged -v /run/docker.sock:/run/docker.sock POWERFULLCONTAINER

If you want to just disable SELinux you can do this by using the --security-opt label:disable flag.

docker run --security-opt label:disable -v /run/docker.sock:/run/docker.sock POWERFULLCONTAINER

Note: in the future, if you are using User Namespace and have this problem, a new --userns=host flag is being
developed, which will turn off user namespace within the container.

Adding a new filename transition rule.

Tue, 05 Apr 2016 12:20:41 GMT

Way back in 2012 we added File Name Transition Rules.  These rules allow us to create content with the correct label in a directory with a different label.  Prior to File Name Transition Rules, administrators and other tools like init scripts creating content in a directory would have to remember to execute restorecon on the new content.  In a lot of cases they would forget and we would end up with mislabeled content.  In some cases this would open up a race condition where the data would be temporarily mislabeled and could cause security problems.

I recently received this email and figured I should write a blog.

Hiya everyone. I'm an SELinux noob.

I love the newish file name transition feature. I was first made aware of it some time after RHEL7 was released (, probably thanks to some mention from Simon or one of the rest of you on this list. For things that can't be watched with restorecond, this feature is so awesome.

Can someone give me a quick tutorial on how I could add a custom rule? For example:

filetrans_pattern(unconfined_t, httpd_sys_content_t, httpd_sys_rw_content_t, dir, "rwstorage")

Of course the end goal is that if someone creates a dir named "rwstorage" in /var/www/html, that dir will automatically get the httpd_sys_rw_content_t type. Basically I'm trying to make a clone of the existing rule that does the same thing for "/var/www/html(/.*)?/uploads(/.*)?".

Thanks for reading.

First you need to create a source file myfiletrans.te

policy_module(myfiletrans, 1.0)
gen_require(`
    type unconfined_t, httpd_sys_content_t, httpd_sys_rw_content_t;
')
filetrans_pattern(unconfined_t, httpd_sys_content_t, httpd_sys_rw_content_t, dir, "rwstorage")

Quickly looking at the code we added: when writing policy, if you are using types (unconfined_t, httpd_sys_content_t, httpd_sys_rw_content_t) that are defined in other policy packages, you need to specify this in a gen_require block.  This is similar to defining extern variables to be used in "C".

Then we call the filetrans_pattern interface.  This code tells the kernel that if a process running as unconfined_t creates a dir named rwstorage in a directory labeled httpd_sys_content_t, the directory should be created as httpd_sys_rw_content_t.

Now we need to compile and install the code.  Note that you need to have the selinux-policy-devel package installed.

make -f /usr/share/selinux/devel/Makefile myfiletrans.pp
semodule -i myfiletrans.pp

Lets test it out.

# mkdir /var/www/html/rwstorage
# ls -ldZ /var/www/html/rwstorage
drwxr-xr-x. 2 root root unconfined_u:object_r:httpd_sys_rw_content_t:s0 4096 Apr  5 08:02 /var/www/html/rwstorage

Lets make sure the old behaviour still works.

# mkdir /var/www/html/rwstorage1
# ls -lZ /var/www/html/rwstorage1 -d
drwxr-xr-x. 2 root root unconfined_u:object_r:httpd_sys_content_t:s0 4096 Apr  5 08:04 /var/www/html/rwstorage1

This is an excellent way to customize your policy, if you continuously see content being created with the incorrect label.[...]

Boolean: virt_use_execmem What? Why? Why not Default?

Tue, 05 Jan 2016 14:02:01 GMT

In a recent bugzilla, the reporter was asking about the virt_use_execmem boolean.

  • What is it?

  • What did it allow?

  • Why was it not on by default?

What is it?

Well lets first look at the AVC

type=AVC msg=audit(1448268142.167:696): avc:  denied  { execmem } for  pid=5673 comm="qemu-system-x86" scontext=system_u:system_r:svirt_t:s0:c679,c730 tcontext=system_u:system_r:svirt_t:s0:c679,c730 tclass=process permissive=0

If you run this under audit2allow it gives you the following message:

#============= svirt_t ==============

#!!!! This avc can be allowed using the boolean 'virt_use_execmem'
allow svirt_t self:process execmem;

Setroubleshoot also tells you to turn on the virt_use_execmem boolean.

# setsebool -P virt_use_execmem 1

What does the virt_use_execmem boolean do?

# semanage boolean -l | grep virt_use_execmem
virt_use_execmem               (off  ,  off)  Allow confined virtual guests to use executable memory and executable stack

Ok what does that mean?  Uli Drepper back in 2006 added a series of memory checks to the SELinux kernel to handle common
attack vectors on programs using executable memory.    Basically these memory checks would allow us to stop a hacker from taking
over confined applications using buffer overflow attacks.

If qemu needs this access, why is this not enabled by default?

Using standard kvm vm's does not require qemu to have the execmem privilege.  Blocking execmem stops certain attack vectors, such as
buffer overflow attacks where the hacked process is able to overwrite memory and then execute the code the hacked
program wrote.

When using different qemu emulators that do not use kvm, the emulators require execmem to work.  If you look at
the AVC above, note that the user was running qemu-system-x86.  In order for this emulator to work it
needs execmem, so we have to loosen the policy slightly to allow the access.  Turning on the virt_use_execmem boolean
could allow a qemu process that is susceptible to a buffer overflow attack to be hacked; SELinux would not block this.
Note: lots of other SELinux blocks would still be in effect.

Since most people use kvm for VMs, we disable it by default.

In a perfect world, libvirt would be changed to launch different emulators with different SELinux types, based on whether or not the emulator
requires execmem.   For example svirt_tcg_t is defined which allows this access.

Then you could run svirt_t kvm/qemus and svirt_tcg_t/qemu-system-x86 VMs on the same machine at the same time without having to lower
the security.  I am not sure if this is a common situation, and no one has done the work to make this happen.

How come MCS Confinement is not working in SELinux even in enforcing mode?

Wed, 16 Sep 2015 21:33:35 GMT

MCS separation is a key feature in sVirt technology.

We currently use it for separation of our Virtual machines using libvirt to launch vms with different MCS labels.  SELinux sandbox relies on it to separate out its sandboxes.  OpenShift relies on this technology for separating users, and now docker uses it to separate containers.  

When I discover a hammer, everything looks like a nail.

I recently saw this email.

"I have trouble understanding how MCS labels work, they are not being enforced on my RHEL7 system even though selinux is "enforcing" and the policy used is "targeted". I don't think I should be able to access those files:

$ ls -lZ /tmp/accounts-users /tmp/accounts-admin
-rw-rw-r--. backup backup guest_u:object_r:user_tmp_t:s0:c3
-rw-rw-r--. backup backup guest_u:object_r:user_tmp_t:s0:c99
backup@test ~ $ id
uid=1000(backup) gid=1000(backup) groups=1000(backup)

root@test ~ # getenforce

I can still access them even though they have different labels (c3 and
c99 as opposed to my user having c1).
backup@test ~ $ cat /tmp/accounts-users
domenico balance: -30
backup@test ~ $ cat /tmp/accounts-admin
don't lend money to domenico

Am I missing something?"

MCS is different than type enforcement.

We decided not to apply MCS Separation to every type.    We only apply it to the types that we plan on running in a multi-tenant way.  Basically it is for objects that we want to share the same access to the system, but not to each other.  We introduced an attribute called mcs_constrained_type.

On my Fedora Rawhide box I can look for these types:

seinfo -amcs_constrained_type -x

If you add the mcs_constrained_type attribute to a type, the kernel will start enforcing MCS separation on that type.

Adding a policy like this will MCS confine guest_t:

# cat myguest.te
policy_module(myguest, 1.0)

gen_require(`
    type guest_t;
    attribute mcs_constrained_type;
')

typeattribute guest_t mcs_constrained_type;

# make -f /usr/share/selinux/devel/Makefile
# semodule -i myguest.pp

Now I want to test this out.  First I have to allow the guest_u user to use multiple MCS labels.  You would not
have to do this with non-user types.

# semanage user -m -r s0-s0:c0.c1023 guest_u

Create content to read and change its MCS label:

# echo Read It > /tmp/test
# chcon -l s0:c1,c2 /tmp/test
# ls -Z /tmp/test
unconfined_u:object_r:user_tmp_t:s0:c1,c2 /tmp/test

Now log in as a guest user:

# id -Z
# cat /tmp/test
Read It

Now log in as a guest user with a different MCS label:

# id -Z
# cat /tmp/test
cat: /tmp/test: Permission denied
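
The two outcomes above come down to category dominance: on an MCS-constrained type, access is allowed only when the process holds every category that is set on the file. A minimal Python sketch of that comparison follows. The function names are hypothetical, the example process labels are made up for illustration, it handles only a single level such as s0:c1,c2 (not a low-high range like s0-s0:c0.c1023), and it ignores the sensitivity part, so treat it as an illustration of the idea rather than the kernel's actual constraint logic.

```python
def parse_categories(level):
    """Return the set of category numbers in a level like 's0:c1,c2' or 's0:c0.c1023'."""
    _sensitivity, _sep, cats = level.partition(":")
    categories = set()
    if not cats:
        return categories  # a plain 's0' carries no categories
    for part in cats.split(","):
        if "." in part:
            # 'c0.c1023' is shorthand for the inclusive range c0 through c1023
            low, high = part.split(".")
            categories.update(range(int(low[1:]), int(high[1:]) + 1))
        else:
            categories.add(int(part[1:]))
    return categories


def dominates(process_level, file_level):
    """True when the process holds every category that is set on the file."""
    return parse_categories(process_level) >= parse_categories(file_level)


if __name__ == "__main__":
    # /tmp/test was labeled s0:c1,c2 in the walkthrough above.
    print(dominates("s0:c1,c2", "s0:c1,c2"))  # matching label (assumed process label)
    print(dominates("s0:c3,c4", "s0:c1,c2"))  # different label (assumed process label)
```

A process whose categories are a superset of the file's categories passes the check, which is why a label with no categories at all (plain s0) is readable by every MCS label.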

libselinux is a liar!!!

Sun, 13 Sep 2015 11:27:20 GMT

On an SELinux enabled machine, why does getenforce in a docker container say it is disabled?

SELinux is not namespaced

This means that there is only one SELinux rule base for all containers on a system.  When we attempt to confine containers we want to prevent them from writing to kernel file systems, which might be one mechanism for escape.  One of those file systems is /sys/fs/selinux, and we also want to control their access to things like the /proc/self/attr/* fields.

By default Docker processes run as svirt_lxc_net_t and they are prevented from doing (almost) all SELinux operations.  But processes within containers do not know that they are running within a container.  SELinux aware applications are going to attempt to do SELinux operations, especially if they are running as root.

For example, if you are running yum/dnf/rpm inside of a docker build container and the tool sees that SELinux is enabled, the tool is going to attempt to set labels on the file system.  If SELinux blocks the setting of these file labels, the calls fail, causing the tool to fail and exit.  Because of this, SELinux-aware applications within containers would mostly fail.

Libselinux is a liar

We obviously do not want these apps failing, so we decided to make libselinux lie to the processes.  libselinux checks whether /sys/fs/selinux is mounted on the system and whether it is mounted read/write.  If /sys/fs/selinux is not mounted read/write, libselinux will report to calling applications that SELinux is disabled.  In containers we either don't mount this file system by default or we mount it read-only, causing libselinux to report that SELinux is disabled.

# getenforce
# docker run --rm fedora id -Z
id: --context (-Z) works only on an SELinux-enabled kernel

# docker run --rm -v /sys/fs/selinux:/sys/fs/selinux:ro fedora id -Z
id: --context (-Z) works only on an SELinux-enabled kernel
# docker run --rm -v /sys/fs/selinux:/sys/fs/selinux fedora id -Z

When SELinux aware applications like yum/dnf/rpm see SELinux is disabled, they stop trying to do SELinux operations, and succeed within containers.
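
The decision described above can be sketched in a few lines of Python. This is not libselinux code: the function name and the sample mount tables are assumptions made for illustration, and the sketch works on /proc/mounts-style text instead of calling into the kernel. It only mirrors the rule that a missing or read-only selinuxfs mount is reported as "disabled".

```python
def selinux_reported_enabled(mounts_text):
    """Return True only if a selinuxfs mount exists and is mounted read/write."""
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # not a well-formed mount entry
        _device, _mount_point, fs_type, options = fields[:4]
        if fs_type == "selinuxfs" and "rw" in options.split(","):
            return True
    return False


# Hypothetical mount tables, written in /proc/mounts format.
HOST_MOUNTS = "selinuxfs /sys/fs/selinux selinuxfs rw,relatime 0 0\n"
READ_ONLY_MOUNTS = "selinuxfs /sys/fs/selinux selinuxfs ro,relatime 0 0\n"
CONTAINER_MOUNTS = "proc /proc proc rw,nosuid,nodev,noexec,relatime 0 0\n"

if __name__ == "__main__":
    for name, text in [("host", HOST_MOUNTS),
                       ("read-only", READ_ONLY_MOUNTS),
                       ("not mounted", CONTAINER_MOUNTS)]:
        print(name, selinux_reported_enabled(text))
```

The three sample tables correspond to the three docker run cases shown earlier: no mount, a :ro bind mount, and a read/write bind mount.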

Applications work well even though SELinux is very much enforcing, and controlling their activity.

I believe that SELinux is the best tool we currently use to make containers actually contain.

In this case SELinux disabled does not make me cry. 

nsenter gains SELinux support

Thu, 27 Aug 2015 11:44:35 GMT

nsenter is a program that allows you to run a program with the namespaces of other processes.

This tool is often used to enter containers like docker, systemd-nspawn or rocket.   It can be used for debugging or for scripting
tools to work inside of containers.  One problem was that the process entering the container could potentially
be attacked by processes within the container.   From an SELinux point of view, you might be injecting an unconfined_t process
into a container that is running as svirt_lxc_net_t.  We wanted a way to change the process context when it entered the container
to match the context of the process (identified by pid) whose namespaces you are entering.

As of util-linux-2.27, nsenter now has this support.

man nsenter
       -Z, --follow-context
              Set the SELinux security context used for executing a new process according to already running process specified by --target PID. (The util-linux has to be compiled with SELinux support otherwise the option is unavailable.)

docker exec already did this, but this gives debuggers, testers, and scripters a new tool to use with namespaces and containers.

'CVE-2015-4495 and SELinux', Or why doesn't SELinux confine Firefox?

Tue, 11 Aug 2015 15:42:10 GMT

Why don't we confine Firefox with SELinux?

That is one of the most often asked questions, especially after a new CVE like CVE-2015-4495 shows up.  This vulnerability in firefox allows a remote session to grab any files in your home directory.  If you can read the file then firefox can read it and send it back to the website that infected your browser.

The big problem with confining desktop applications is the way the desktop has been designed. I wrote about confining the desktop several years ago.  As I explained then, the problem is that applications are allowed to communicate with each other in lots of different ways. Here are just a few.

*   X Windows.  All apps need full access to the X Server. I tried several years ago to block applications' access to the keyboard settings, in order to block keystroke logging (google xspy).  I was able to get it to work but a lot of applications started to break.  Other access that you would want to block in X would be screen capture and access to the cut/paste buffer. But blocking these would cause too much breakage on the system.  XAce was an attempt to add MAC controls to X and is used in MLS environments, but I believe it causes too much breakage.

*   File system access.  Users expect firefox to be able to upload and download files anywhere they want on the desktop.  If I was czar of the OS, I could state that upload files must go into ~/Upload and Download files go into ~/Download, but then users would want to upload photos from ~/Photos, or create their own random directories.  Blocking access to any particular directory including .ssh would be difficult, since someone probably has a web based ssh session or some other tool that can use an ssh public key to authenticate.  (This is the biggest weakness described in CVE-2015-4495.)

*   Dbus communications as well as gnome shell, shared memory, Kernel Keyring, access to the camera and microphone ...

Everyone expects all of these to just work, so blocking these with MAC tools and SELinux is more likely to lead to "setenforce 0" than to actually add a lot of security.

Helper Applications.

One of the biggest problems with confining a browser is helper applications.  Let's imagine I ran firefox with SELinux type firefox_t.  The user clicks on a .odf file or a .doc file, the browser downloads the file and launches LibreOffice so the user can view the file.  Should LibreOffice run as LibreOffice_t or firefox_t?  If it runs as LibreOffice_t, then if the LibreOffice_t app was looking at a different document, the content might be able to subvert the process.  If I run LibreOffice as firefox_t, what happens when the user launches a document off of his desktop?  It will not launch a new LibreOffice; it will just communicate with the running LibreOffice and open the document, making it accessible to firefox_t.

Confining Plugins.

For several years now we have been confining plugins with SELinux in Firefox and Chrome.  This prevents tools like flashplugin from having much access to the desktop.  But we have had to add booleans to turn off the confinement, since certain plugins end up wanting more access.

mozilla_plugin_bind_unreserved_ports --> off
mozilla_plugin_can_network_connect --> off
mozilla_plugin_use_bluejeans --> off
mozilla_plugin_use_gps --> off
mozilla_plugin_use_spice --> off
unconfined_mozilla_plugin_transition --> on

SELinux Sandbox

I did introduce the SELinux Sandbox a few years ago.

The SELinux sandbox would allow you to confine desktop applications using container technologies including SELinux.  You could run firefox, LibreOffi[...]

To exec or transition that is the question...

Thu, 30 Jul 2015 14:07:40 GMT

I recently received a question on writing policy via LinkedIn.

Hi, Dan -

I am working on SELinux right now and I know you are an expert on it.. I believe you can give me a help. Now in my policy, I did in myadm policy like
require { ...; type ping_exec_t; ...;class dir {...}; class file {...}; }

allow myadm_t ping_exec_t:file { execute execute_no_trans };

Seems the ping is not work, I got error
ping: icmp open socket: Permission denied

Any ideas?

My response:

When running another program there are two things that can happen:
1. You can execute the program in the current context (which is what you did).
This means that myadm_t needs to have all of the permissions of ping.

2. You can transition to the executable's domain (ping_t).

We usually use interfaces for this.



I think if you looked at your AVCs you would probably see something about myadm_t needing the net_raw capability.

sesearch -A -s ping_t -c capability
Found 1 semantic av rules:
   allow ping_t ping_t : capability { setuid net_raw } ;

net_raw access allows ping_t to create and send icmp packets.  You could add that to myadm_t, but that would allow it
to listen at a low level to network traffic, which might not be something you want.  Transitioning is probably better.


Transitioning could cause other problems, like leaked file descriptors or bash redirection.  For example if you do a
ping > /tmp/mydata, then you might have to add rules to ping_t to be allowed to write to the label of /tmp/mydata.

It is your choice about which way to go.

I usually transition if there is a lot of access needed, but if there is only limited access needed that I deem not too risky, I
exec and add the additional access to the current domain.

I get a SYS_PTRACE AVC when my utility runs ps, how come?

Mon, 29 Jun 2015 10:54:54 GMT

We often get random SYS_PTRACE AVCs, usually when an application is running the ps command or reading content in /proc.

type=AVC msg=audit(1426354432.990:29008): avc:  denied  { sys_ptrace } for  pid=14391 comm="ps" capability=19  scontext=unconfined_u:unconfined_r:mozilla_plugin_t:s0-s0:c0.c1023 tcontext=unconfined_u:unconfined_r:mozilla_plugin_t:s0-s0:c0.c1023 tclass=capability permissive=0

sys_ptrace usually indicates that one process is trying to look at the memory of another process with a different UID.


man capabilities

       CAP_SYS_PTRACE
              *  Trace arbitrary processes using ptrace(2);
              *  apply get_robust_list(2) to arbitrary processes;
              *  transfer data to or from the memory  of  arbitrary  processes
                 using process_vm_readv(2) and process_vm_writev(2).
              *  inspect processes using kcmp(2).

These types of access should probably be dontaudited. 

Running the ps command as a privileged process can cause sys_ptrace to happen.

There is special data under /proc that a privileged process would access by running the ps command.

This data is almost never actually needed by the process running ps; the data is used by debugging tools
to see where some of the randomized memory of a process is set up.

The easiest thing for policy writers to do is to dontaudit the access.

How do files get mislabeled?

Mon, 29 Jun 2015 10:38:13 GMT

Sometimes we close bugs as CLOSED_NOT_A_BUG because a file was mislabeled, and we then tell the user to just run restorecon on the object.

But this leaves the user with the question,

How did the file get mislabeled?

They did not run the machine in permissive mode or disable SELinux, but still stuff became mislabeled.  How come?

The most common cause of this in my experience is the mv command.  When users mv files around their system, the mv command maintains the security context of the source object.

sudo mv ~/mydir/index.html /var/www/html

This ends up with a file labeled user_home_t in /var/www/html, rather than httpd_sys_content_t, and the apache process is not allowed to read it.  If you use mv -Z on newer SELinux systems, it will change the context to the default for the target directory.

Another common cause is debugging a service or running a service by hand.

This bug report is a potential example.

Sometimes we see content under /run (/var/run) which is labeled var_run_t when it should have been labeled something specific to the domain that created it, like apmd_var_run_t.
The most likely cause of this is that the object was created by an unconfined domain like unconfined_t.  Basically an unconfined domain creates the object with a label based on the parent directory, which would label it as var_run_t.

I would guess that the user/admin ran the daemon directly rather than through the init script.

# /usr/bin/acpid
# gdb /usr/bin/acpid

When acpid created /run/acpid.socket, the object would be mislabeled.  Later, when the user runs the service through the init system, it gets run with the correct type (apmd_t) and is denied from deleting the file.

type=AVC msg=audit(1418942223.880:4617): avc:  denied  { unlink } for  pid=24444 comm="acpid" name="acpid.socket" dev="tmpfs" ino=2550865 scontext=system_u:system_r:apmd_t:s0 tcontext=system_u:object_r:var_run_t:s0 tclass=sock_file permissive=0
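
When triaging a denial like the one above, the useful parts are the permission, the source type, the target type and the object class. The following Python sketch pulls those out of a single AVC line. The function name and the returned dictionary layout are assumptions for illustration, and the regular expressions are only known to cope with records shaped like the ones quoted in this post.

```python
import re

# The AVC record quoted above, kept here so the sketch is self-contained.
AVC_LINE = (
    'type=AVC msg=audit(1418942223.880:4617): avc:  denied  { unlink } for  '
    'pid=24444 comm="acpid" name="acpid.socket" dev="tmpfs" ino=2550865 '
    'scontext=system_u:system_r:apmd_t:s0 tcontext=system_u:object_r:var_run_t:s0 '
    'tclass=sock_file permissive=0'
)


def parse_avc(line):
    """Extract the denied permissions and the source/target types from one AVC record."""
    perms = re.search(r"denied\s+\{([^}]*)\}", line)
    # Collect key=value pairs; values are either quoted strings or bare tokens.
    pairs = re.findall(r'(\w+)=("[^"]*"|\S+)', line)
    fields = {key: value.strip('"') for key, value in pairs}
    return {
        "permissions": perms.group(1).split() if perms else [],
        # A context is user:role:type:level, so the type is the third field.
        "source_type": fields["scontext"].split(":")[2],
        "target_type": fields["tcontext"].split(":")[2],
        "tclass": fields["tclass"],
        "name": fields.get("name"),
    }


if __name__ == "__main__":
    print(parse_avc(AVC_LINE))
```

Here a confined domain (apmd_t) being denied on a generic parent-directory type (var_run_t) is the pattern that points at a mislabeled file rather than at missing policy.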

Sadly there is not much we can do to prevent this type of mislabeled file from being created, and we end up having to tell the user to run restorecon.

Is SELinux good anti-venom?

Tue, 19 May 2015 20:00:56 GMT

SELinux to the Rescue

If you have been following the news lately you might have heard of the "Venom" vulnerability. Researchers found a bug in the Qemu process, which is used to run virtual machines on top of KVM based linux machines. Red Hat, Centos and Fedora systems were potentially vulnerable. Updated packages have been released for all platforms to fix the problem.

But we use SELinux to prevent virtual machines from attacking other virtual machines or the host. SELinux protection on VMs is often called sVirt. We run all virtual machines with the svirt_t type. We also use MCS Separation to isolate one VM from other VMs and their images on the system.

While to the best of my knowledge no one has developed an actual hack to break out of the virtualization layer, I do wonder whether or not the break out would even be allowed by SELinux. SELinux has protections against executable memory, which is usually used for buffer overflow attacks. These are the execmem, execheap and execstack access controls. There is a decent chance that these would have blocked the attack.

# sesearch -A -s svirt_t -t svirt_t -c process -C
Found 2 semantic av rules:
   allow svirt_t svirt_t : process { fork sigchld sigkill sigstop signull signal getsched setsched getsession getcap getattr setrlimit } ;
DT allow svirt_t svirt_t : process { execmem execstack } ; [ virt_use_execmem ]

Examining the policy on my Fedora 22 machine, we can look at the types that a svirt_t process would be allowed to write. These are the types that SELinux would allow the process to write, if they had matching MCS labels, or s0.

# sesearch -A -s svirt_t -c file -p write -C | grep open
   allow virt_domain qemu_var_run_t : file { ioctl read write create getattr setattr lock append unlink link rename open } ;
   allow virt_domain svirt_home_t : file { ioctl read write create getattr setattr lock append unlink link rename open } ;
   allow virt_domain svirt_tmp_t : file { ioctl read write create getattr setattr lock append unlink link rename open } ;
   allow virt_domain svirt_image_t : file { ioctl read write create getattr setattr lock append unlink link rename open } ;
   allow virt_domain svirt_tmpfs_t : file { ioctl read write create getattr setattr lock append unlink link rename open } ;
   allow virt_domain virt_cache_t : file { ioctl read write create getattr setattr lock append unlink link rename open } ;
DT allow virt_domain fusefs_t : file { ioctl read write create getattr setattr lock append unlink link rename open } ; [ virt_use_fusefs ]
DT allow virt_domain cifs_t : file { ioctl read write create getattr setattr lock append unlink link rename open } ; [ virt_use_samba ]
ET allow virt_domain dosfs_t : file { ioctl read write create getattr setattr lock append unlink link rename open } ; [ virt_use_usb ]
DT allow virt_domain nfs_t : file { ioctl read write create getattr setattr lock append unlink link rename open } ; [ virt_use_nfs ]
ET allow virt_domain usbfs_t : file { ioctl read write getattr lock append open } ; [ virt_use_usb ]

Lines beginning with the D are disabled, and only enabled by toggling the boolean.

I did a video showing the access available to an OpenShift process running as root on your system using the same technology.

SELinux also blocks capabilities, so the qemu process even if running as root would only have the net_bind_service capability, which allows it to bind to ports < 1024.

# sesearch -A -s svirt_t -c capability -C
Found 1 semantic av rules:
   allow svirt_t svirt_t : capability net_bind_servi[...]

A follow up to the Bash Exploit and SELinux.

Fri, 26 Sep 2014 12:58:19 GMT

One of the advantages of a remote exploit is to be able to set up and launch attacks on other machines.

I wondered if it would be possible to set up a botnet attack using the remote attack on an apache server with the bash exploit.

Looking at my rawhide machine's policy:

sesearch -A -s httpd_sys_script_t -p name_connect -C | grep -v ^D
Found 24 semantic av rules:
   allow nsswitch_domain dns_port_t : tcp_socket { recv_msg send_msg name_connect } ;
   allow nsswitch_domain dnssec_port_t : tcp_socket name_connect ;
ET allow nsswitch_domain ldap_port_t : tcp_socket { recv_msg send_msg name_connect } ; [ authlogin_nsswitch_use_ldap ]

The apache script would only be allowed to connect to/attack a DNS server and an LDAP server.  It would not be allowed to become a spam bot (no connection to mail ports) or even attack other web services.

Could an attacker leave a back door to be later connected to even after the bash exploit is fixed?

# sesearch -A -s httpd_sys_script_t -p name_bind -C | grep -v ^D

Nope!  On my box the httpd_sys_script_t process is not allowed to listen on any network ports.

I guess the crackers will just have to find a machine with SELinux disabled.

What does SELinux do to contain the bash exploit?

Thu, 25 Sep 2014 21:37:39 GMT

Do you have SELinux enabled on your Web Server?

Lots of people are asking me about SELinux and the Bash Exploit.

I did a quick analysis on one reported remote Apache exploit, which shows an example of the bash exploit on an apache server. It even shows that SELinux was enforcing when the exploit happened.

SELinux does not block the exploit but it would prevent escalation of confined domains.

Why didn't SELinux block it?

SELinux controls processes based on their types; if the process is doing what it was designed to do then SELinux will not block it.

In the defined exploit the apache server is running as httpd_t and it is executing a cgi script which would be labeled httpd_sys_script_exec_t.

When httpd_t executes a script labeled httpd_sys_script_exec_t SELinux will transition the new process to httpd_sys_script_t.

SELinux policy allows processes running as httpd_sys_script_t to write to /tmp, so it was successful in creating /tmp/aa. If you did this and looked at the content in /tmp it would be labeled httpd_tmp_t.

Let's look at which files httpd_sys_script_t is allowed to write to on my Rawhide box.

# sesearch -A -s httpd_sys_script_t -c file -p write -C | grep open | grep -v ^D
   allow httpd_sys_script_t httpd_sys_rw_content_t : file { ioctl read write create getattr setattr lock append unlink link rename open } ;
   allow httpd_sys_script_t anon_inodefs_t : file { ioctl read write getattr lock append open } ;
   allow httpd_sys_script_t httpd_sys_script_t : file { ioctl read write getattr lock append open } ;
   allow httpd_sys_script_t httpd_tmp_t : file { ioctl read write create getattr setattr lock append unlink link rename open } ;

httpd_sys_script_t is a process label which only applies to content in /proc. This means processes running as httpd_sys_script_t can write to their process data.

anon_inodefs_t is an in memory label, most likely not on your disk.

The only on-disk places it can write are files labeled httpd_sys_rw_content_t and /tmp.

grep httpd_sys_rw_content_t /etc/selinux/targeted/contexts/files/file_contexts

or on my box

# find /etc -context "*:httpd_sys_rw_content_t:*"
/etc/BackupPC
/etc/BackupPC/
/etc/BackupPC/hosts
/etc/glpi

With SELinux disabled, this hacked process would be allowed to write any content that is world writable on your system as well as any content owned by the apache user or group.

Let's look at what it can read.

sesearch -A -s httpd_sys_script_t -c file -p read -C | grep open | grep -v ^D | grep -v exec_t
   allow domain locale_t : file { ioctl read getattr lock open } ;
   allow httpd_sys_script_t iso9660_t : file { ioctl read getattr lock open } ;
   allow httpd_sys_script_t httpd_sys_ra_content_t : file { ioctl read create getattr lock append open } ;
   allow httpd_sys_script_t httpd_sys_rw_content_t : file { ioctl read write create getattr setattr lock append unlink link rename open } ;
   allow httpd_sys_script_t squirrelmail_spool_t : file { ioctl read getattr lock open } ;
   allow domain ld_so_t : file { ioctl read getattr execute open } ;
   allow httpd_sys_script_t anon_inodefs_t : file { ioctl read write getattr lock append open } ;
   allow httpd_sys_script_t sysctl_kernel_t : file { ioctl read getattr lock open } ;
   allow domain base_ro_file_type : file { ioctl read getattr lock open } ;
   allow httpd_sys_script_t httpd_sys_script_t : file { ioctl read write getattr lock app[...]

Confusion with sesearch.

Mon, 15 Sep 2014 11:07:47 GMT

I just saw an email where a user was asking why sesearch is showing access but the access is still getting denied.

I'm running CentOS 6. I've httpd running which accesses a file but it results in access denied with the following --

type=AVC msg=audit(1410680693.979:40): avc:  denied  { read } for pid=987 comm="httpd" name="README.txt" dev=dm-0 ino=12573 scontext=unconfined_u:system_r:httpd_t:s0 tcontext=unconfined_u:object_r:user_home_t:s0 tclass=file


sesearch -A | grep 'allow httpd_t' | grep ': file' | grep user_home_t
   allow httpd_t user_home_t : file { ioctl read getattr lock open } ;
   allow httpd_t user_home_t : file { ioctl read getattr lock open } ;


sesearch is a great tool that we use all the time.  It allows you to analyze and look at the SELinux policy.  It is part of the setools-console package.  It uses the "Apol" libraries to examine policy, the same libraries we have used to build the new tool set sepolicy.

The problem was that he was using sesearch incorrectly.  sesearch -A shows you all possible allow rules, not just the allow rules that are currently in effect.

The user needs to add the -C option to sesearch.  The -C option shows you the booleans required for that access.  At the beginning of each conditional line it also shows a capital E or D, indicating whether the rule is currently enabled or disabled by its boolean.
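
The E/D prefix can also be applied in a script when you want only the rules that are in effect right now, which is what piping through grep -v ^D does. Below is a small Python sketch of that filter. The function name is hypothetical, the sample text is assembled from sesearch -C output quoted in these posts, and the sketch assumes conditional lines start with a two-letter prefix whose first letter is E or D while unconditional lines start directly with "allow".

```python
def rules_in_effect(sesearch_output):
    """Return allow rules that are unconditional or whose boolean currently enables them."""
    effective = []
    for line in sesearch_output.splitlines():
        tokens = line.strip().split(None, 1)
        if len(tokens) < 2:
            continue
        first, rest = tokens
        if first == "allow":
            # No prefix: the rule does not depend on a boolean.
            effective.append("allow " + rest)
        elif len(first) == 2 and first[0] == "E" and rest.startswith("allow "):
            # 'E' prefix: conditional rule that is enabled right now.
            effective.append(rest)
        # Lines starting with 'D' (disabled) and header lines are skipped.
    return effective


SAMPLE_OUTPUT = """\
Found 3 semantic av rules:
   allow nsswitch_domain dns_port_t : tcp_socket { recv_msg send_msg name_connect } ;
ET allow nsswitch_domain ldap_port_t : tcp_socket { recv_msg send_msg name_connect } ; [ authlogin_nsswitch_use_ldap ]
DT allow httpd_t user_home_type : file { ioctl read getattr lock open } ; [ httpd_read_user_content ]
"""

if __name__ == "__main__":
    for rule in rules_in_effect(SAMPLE_OUTPUT):
        print(rule)
```

Running sesearch without -C gives no prefix at all, so a filter like this would treat every rule as unconditional; that is exactly the confusion in the email.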

On my machine, I will use a more specific command.  This command says: show the allow rules for a source type of httpd_t and a target type of user_home_t, with permission=read on class=file.

sesearch -A -C -s httpd_t -t user_home_t -p read -c file
Found 1 semantic av rules:
DT allow httpd_t user_home_type : file { ioctl read getattr lock open } ; [ httpd_read_user_content ]

As you can see, on my machine the boolean is disabled, so Apache is not allowed to read general content in my homedir, which I assume was also true for the user.   If the user wants to allow httpd_t to read all general content in the user's homedir, he can turn on the httpd_read_user_content boolean.

If you want to allow it to read just certain directories/files, which is recommended, you should change the label on the directory.  BTW ~/public_html and ~/www already have the correct labeling.

matchpathcon ~/public_html ~/www
/home/dwalsh/public_html    staff_u:object_r:httpd_user_content_t:s0
/home/dwalsh/www    staff_u:object_r:httpd_user_content_t:s0

I would not want to let the apache process read general content in my homedir, since I might be storing critical stuff like credit card data, passwords, and unflattering pictures of me in there. :^)

What is this new unconfined_service_t type I see on Fedora 21 and RHEL7?

Wed, 10 Sep 2014 20:18:36 GMT

Everyone that has ever used SELinux knows that the unconfined_t domain is a process label that is not confined.  But this is not the only unconfined domain on a SELinux system.  It is actually the default domain of a user that logs onto a system.  In a lot of ways we should have used the type unconfined_user_t rather than unconfined_t.

By default in an SELinux Targeted system there are lots of other unconfined domains.  We have these so that users can run programs/services without SELinux interfering if SELinux does not know about them. You can list the unconfined domains on your system using the following command.

seinfo -aunconfined_domain_type -x

In RHEL6 and older versions of Fedora, we used to run system services as initrc_t by default, unless someone had written a policy for them.  initrc_t is an unconfined domain by default, unless you disabled the unconfined.pp module. Running unknown services as initrc_t allows administrators to run an application service, even if no policy has ever been written for it.

In RHEL6 we have these rules:

init_t @initrc_exec_t -> initrc_t
init_t @bin_t -> initrc_t

If an administrator added an executable service to /usr/sbin or /usr/bin, the init system would run the service as initrc_t.

We found this to be problematic, though.  The problem was that we have lots of transition rules out of initrc_t.  If a program we did not know about was running as initrc_t and executed a program like rsync to copy data between servers, SELinux would transition the program to rsync_t and it would blow up.  SELinux mistakenly would think that rsync was set up in server mode, not client mode.  Other transition rules could also cause breakage.  We decided we needed a new unconfined domain to run services with, that would have no transition rules.  We introduced the unconfined_service_t domain.

Now we have:

init_t @bin_t -> unconfined_service_t

A process running as unconfined_service_t is allowed to execute any confined program, but stays in the unconfined_service_t domain.  SELinux will not block any access. This means by default, if you install a service that does not have policy written for it, it should work without SELinux getting in the way.

Sometimes applications are installed in fairly random directories under /usr or /opt (or in oracle's case /u01), which end up with the label of usr_t, therefore we added these transition rules to policy.

# sesearch -T -s init_t  | grep unconfined_service_t
type_transition init_t bin_t : process unconfined_service_t;
type_transition init_t usr_t : process unconfined_service_t;

You can see it in Fedora21.

Bottom Line

Hopefully unconfined_service_t will make leaving SELinux enabled easier on systems that have to run third party services, and protect the other services that run on your system.

Note:

Thanks to Simon Sekidde and Miroslav Grepl for helping to write this blog.[...]
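
The difference between initrc_t and unconfined_service_t can be modelled as a lookup table of transition rules keyed by (current domain, label of the executable). The Python sketch below is a toy model, not how the kernel stores policy. The table holds the rules mentioned in this post plus one assumed entry, (initrc_t, rsync_exec_t) -> rsync_t, which stands in for the "lots of transition rules out of initrc_t"; the rsync_exec_t label and the function name are assumptions made for illustration.

```python
# (current domain, executable label) -> new domain
TYPE_TRANSITIONS = {
    ("init_t", "initrc_exec_t"): "initrc_t",
    ("init_t", "bin_t"): "unconfined_service_t",
    ("init_t", "usr_t"): "unconfined_service_t",
    # Assumed rule, standing in for the many transitions out of initrc_t.
    ("initrc_t", "rsync_exec_t"): "rsync_t",
}


def domain_after_exec(current_domain, exec_label, rules=TYPE_TRANSITIONS):
    """Return the domain of the new process; with no matching rule the domain is unchanged."""
    return rules.get((current_domain, exec_label), current_domain)


if __name__ == "__main__":
    # A third-party service in /usr/bin (bin_t) started by init.
    service_domain = domain_after_exec("init_t", "bin_t")
    print(service_domain)
    # That service running rsync stays where it is: no rules leave unconfined_service_t.
    print(domain_after_exec(service_domain, "rsync_exec_t"))
    # The old behaviour: a service running as initrc_t gets pulled into rsync_t.
    print(domain_after_exec("initrc_t", "rsync_exec_t"))
```

Because the table has no keys that start from unconfined_service_t, every exec from that domain falls through to the default and the process keeps its unconfined label.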

Think before you just blindly audit2allow -M mydomain

Fri, 05 Sep 2014 12:35:01 GMT

Don't Allow Domains to write Base SELinux Types

A few years ago I wrote a blog and paper on the four causes of SELinux errors.

The first two most common causes were labeling issues and SELinux needs to know.

The easiest way to explain this is a daemon wants to write to a certain file and SELinux blocks the application from writing.  In SELinux terms the Process DOMAIN (httpd_t) wants to write to the file type (var_lib_t) and it is blocked.  Users have potentially three ways of fixing this.

1. Change the type of the file being written.
   * The object might be mislabeled and restorecon of the object fixes the issue
   * Change the label to httpd_var_lib_t using semanage and restorecon

semanage fcontext -a -t httpd_var_lib_t '/var/lib/foobar(/.*)?'
restorecon -R -v /var/lib/foobar

2. There might be a boolean available to allow the Process Domain to write to the file type

setsebool -P HTTP_BOOLEAN 1

3. Modify policy using audit2allow

grep httpd_t /var/log/audit/audit.log | audit2allow -M myhttpd
semodule -i myhttpd.pp

Sadly the third option is the least recommended and the most often used.  The problem is it requires no thought and gets SELinux to just shut up.

In RHEL7 and latest Fedoras, the audit2allow tools will suggest a boolean when you run the AVCs through it.  And setroubleshoot has been doing this for years. setroubleshoot even will suggest potential types that you could change the destination object to use.

The thing we really want to stop is domains writing to BASE types.  If I allow a confined domain to write to a BASE type like etc_t or usr_t, then a hacked system can attack other domains, since almost all other domains need to read some etc_t or usr_t content.

BASE TYPES

One other feature we have added in RHEL7 and Fedora is a list of base types.  SELinux has a mechanism for grouping types based on an attribute.

We have two new attributes: base_ro_file_type and base_file_type.

You can see the objects associated with these attributes using the seinfo command.

seinfo -abase_ro_file_type -x
   base_ro_file_type
      etc_runtime_t
      etc_t
      src_t
      shell_exec_t
      system_db_t
      bin_t
      boot_t
      lib_t
      usr_t
      system_conf_t
      textrel_shlib_t

$ seinfo -abase_file_type -x
   base_file_type
      etc_runtime_t
      unlabeled_t
      device_t
      etc_t
      src_t
      shell_exec_t
      home_root_t
      system_db_t
      var_lock_t
      bin_t
      boot_t
      lib_t
      mnt_t
      root_t
      tmp_t
      usr_t
      var_t
      system_conf_t
      textrel_shlib_t
      lost_found_t
      var_spool_t
      default_t  [...]
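
One way to "think before you audit2allow" is to scan the generated .te file for rules that let a domain write to a base type before loading the module. The Python sketch below does that with a regular expression. It is not part of audit2allow or setroubleshoot: the function name, the choice of which permissions count as a write, and the sample rules are assumptions for illustration, and the base type set is copied from the seinfo output above only as far as it is shown there.

```python
import re

# base_file_type members as listed above (the list in the post is truncated).
BASE_FILE_TYPES = {
    "etc_runtime_t", "unlabeled_t", "device_t", "etc_t", "src_t", "shell_exec_t",
    "home_root_t", "system_db_t", "var_lock_t", "bin_t", "boot_t", "lib_t",
    "mnt_t", "root_t", "tmp_t", "usr_t", "var_t", "system_conf_t",
    "textrel_shlib_t", "lost_found_t", "var_spool_t", "default_t",
}

# Permissions treated as "writing" for this check (an assumption of the sketch).
WRITE_PERMS = {"write", "append", "create", "unlink", "rename", "setattr"}

ALLOW_RULE = re.compile(
    r"^\s*allow\s+(\S+)\s+(\S+)\s*:\s*(\S+)\s+(\{[^}]*\}|\S+?)\s*;"
)


def base_type_writes(te_text):
    """Return (source, target, class, write perms) for allow rules that write to a base type."""
    findings = []
    for line in te_text.splitlines():
        match = ALLOW_RULE.match(line)
        if not match:
            continue
        source, target, tclass, perms = match.groups()
        risky = set(perms.strip("{} ").split()) & WRITE_PERMS
        if target in BASE_FILE_TYPES and risky:
            findings.append((source, target, tclass, sorted(risky)))
    return findings


# Hypothetical audit2allow output used to exercise the check.
SAMPLE_TE = """\
allow httpd_t etc_t:file { write create };
allow httpd_t httpd_var_lib_t:file write;
allow httpd_t usr_t:file read;
"""

if __name__ == "__main__":
    for finding in base_type_writes(SAMPLE_TE):
        print(finding)
```

Only the first sample rule is flagged: the second writes to a domain-specific type, and the third only reads a base type.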