Apache Today

Perchild: Setting Users and Groups per Virtual Host
Aug 18, 2000, 14:48 UTC, by Ryan Bloom

One of the biggest problems with administering a major server that houses multiple sites is restricting access to each site to only the people responsible for maintaining it. The difficulty is that all of the Apache child processes run with the same user and group ID. Therefore, all of the files need to be readable, writable, and executable by the user and group that the server runs as. This becomes a much bigger issue when you add CGI and PHP scripts to the site. If those scripts must access private information, then that information must be stored under relatively insecure user and group IDs.

Apache 1.3 addressed this problem by introducing suexec, but suexec introduces problems of its own, and PHP and mod_perl cannot take advantage of it. Apache 2.0 introduces a new MPM that solves the problem in a more elegant way, one that all scripts can take advantage of.

The new MPM is called Perchild, and it is based on the Dexter MPM. This means that a fixed number of child processes is created, and each process has a dynamic number of threads. In this MPM it is possible to specify user and group IDs for clusters of child processes. Each virtual host is then assigned to run in a specific cluster of child processes. If no cluster is specified, the virtual host runs with the default user and group IDs.

Many designs were considered for this MPM, but in the end only one made sense. The first consideration was which MPM to base it on. The options were prefork, mpmt_pthread, and Dexter. Prefork and mpmt_pthread had one major drawback: whenever the server gets busy, they create new child processes that are completely separate from each other. This means that the parent process would need to determine which user and group IDs each new process should have when it is created. While this seems easy at first glance, it requires load balancing techniques that quickly become very complicated. If the prefork or mpmt_pthread MPMs are desired, it makes more sense to put a load balancer or proxy in front of the web servers and run multiple instances of Apache on different ports. To the client, this would look very similar to the Perchild MPM.

After eliminating prefork and mpmt_pthread, the only option left was Dexter. The next question was how to associate virtual hosts with child processes. Do we base the number of child processes on the number of virtual hosts, or do we allow the web admin to specify how the setup should look? Assuming that the more flexible we make the Perchild MPM, the more likely it is to be used, we allow the web admin to determine how their site looks. This is done through the combination of two directives:

ChildPerUserID NumChildren UserID GroupID
AssignUserID UserID GroupID

The first directive allows the administrator to assign a number of child processes to the same user and group IDs. This provides some level of robustness. Because Perchild creates new threads in the same child process to handle new requests, it is not the most robust server, although it is very scalable. If one of the threads seg faults, the entire process dies, taking with it all of the requests currently being served by that child process. By specifying more than one child per user/group pair, we allow the server to balance requests between multiple child processes. The second directive is specified inside a VirtualHost stanza, and assigns that virtual host to a specific user and group ID. The server is smart enough to route all of the VirtualHosts with the same user and group IDs to the same child processes.
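To make the relationship between the two directives concrete, here is a minimal Python sketch that reads a configuration fragment written in the syntax given above and groups virtual hosts by user/group pair. It is only an illustration, not Apache's parser: the user, group, and host names, the `DEFAULT_IDENTITY` value, and the helper functions `parse` and `hosts_by_identity` are all hypothetical.

```python
from collections import defaultdict

# Hypothetical configuration using the directive syntax from the article.
# All user, group, and host names are made up for illustration.
CONFIG = """\
ChildPerUserID 2 alice web
ChildPerUserID 1 bob web

<VirtualHost alice.example.com>
    AssignUserID alice web
</VirtualHost>

<VirtualHost shop.example.com>
    AssignUserID alice web
</VirtualHost>

<VirtualHost bob.example.com>
    AssignUserID bob web
</VirtualHost>

<VirtualHost misc.example.com>
</VirtualHost>
"""

# Assumption: stands in for the User and Group of the main server.
DEFAULT_IDENTITY = ("nobody", "nobody")


def parse(config):
    """Return (children per identity, identity per virtual host)."""
    children = {}
    vhosts = {}
    current_host = None
    for raw in config.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("<VirtualHost"):
            current_host = line[len("<VirtualHost"):].rstrip(">").strip()
            # Hosts without AssignUserID keep the default identity.
            vhosts[current_host] = DEFAULT_IDENTITY
        elif line == "</VirtualHost>":
            current_host = None
        else:
            words = line.split()
            if words[0] == "ChildPerUserID":
                num, user, group = words[1:4]
                children[(user, group)] = int(num)
            elif words[0] == "AssignUserID" and current_host is not None:
                vhosts[current_host] = (words[1], words[2])
    return children, vhosts


def hosts_by_identity(vhosts):
    """Group virtual hosts that share a user/group pair."""
    grouped = defaultdict(list)
    for host, identity in vhosts.items():
        grouped[identity].append(host)
    return dict(grouped)
```

In this sketch, `alice.example.com` and `shop.example.com` end up in the same group and would be served by the same two child processes, while `misc.example.com` falls back to the default identity.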

How Does it Work?

The obvious question now is how this works internally. The Perchild MPM has a special global table which it uses to start children and to let those children change to the correct user IDs. It also uses the per-server configuration to pass requests between child processes. When the MPM encounters a ChildPerUserID directive, it begins to fill out the global child table. Each child process gets one slot in the table, which stores the user and group ID that the child should run as. The table also stores a socket descriptor, but it isn't filled in until later.

While parsing the configuration for each VirtualHost, if the server encounters an AssignUserID directive, it fills out a Perchild per-server configuration structure, which contains two socket descriptors. To do this, the server creates a pair of anonymous Unix domain sockets, which are used to pass the request between processes. After the sockets are created, the server searches the child table to find the child processes that have the same user and group IDs. Once found, one of the socket descriptors is attached to all of the processes with that user and group combination. Both socket descriptors are attached to the specific VirtualHost that is being configured. This step is repeated for all VirtualHosts. Once all VirtualHosts have been configured, the server ensures that each host has been assigned a socket. If not, the server creates a set of default sockets and stores those in any server that doesn't already have a socket.
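The bookkeeping described here can be sketched in Python as follows. This is a simplified model, not the MPM's C code: the `ChildSlot` and `VHostConf` classes and the `attach_sockets` function are hypothetical names, and the sketch assumes one socket pair per user/group combination, which the description above does not spell out.

```python
import socket
from dataclasses import dataclass
from typing import Optional


@dataclass
class ChildSlot:
    """One entry in the global child table (hypothetical layout)."""
    user: str
    group: str
    sock: Optional[socket.socket] = None  # filled in after config parsing


@dataclass
class VHostConf:
    """Per-server configuration for one virtual host (hypothetical layout)."""
    name: str
    user: str
    group: str
    receive_end: Optional[socket.socket] = None
    send_end: Optional[socket.socket] = None


def attach_sockets(child_table, vhosts):
    """Create Unix domain socket pairs and wire them to children and hosts."""
    pairs = {}
    for vhost in vhosts:
        identity = (vhost.user, vhost.group)
        # Assumption: hosts with the same identity share one socket pair.
        if identity not in pairs:
            pairs[identity] = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
        receive_end, send_end = pairs[identity]
        # Both descriptors are attached to the virtual host...
        vhost.receive_end, vhost.send_end = receive_end, send_end
        # ...and one of them to every child running as that user and group.
        for slot in child_table:
            if (slot.user, slot.group) == identity:
                slot.sock = receive_end
    return pairs
```

With this wiring, any child that accepts a request for a host it does not own can write to that host's `send_end`, and the data arrives at the socket held by the children that do own it.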

The next step is to create the child processes. When each process is started, it checks the global child table and switches to the appropriate user and group IDs. If no user and group ID are specified for this child process, then the User and Group specified in the main server are used. Each child also adds its socket from the child table to the list of sockets it will poll on. From here, child startup proceeds as normal, with each child process polling on all of the ports opened in the parent process. This leaves the server looking like Figure 1.

[Figure 1: the server after child startup]

When a request comes in, the Perchild MPM is the first module called in the post_read_request phase. During this phase, the Perchild MPM checks whether the request is meant for the current child process. If so, processing continues as normal. If not, the child process uses the VirtualHost attached to the request to find the correct Unix domain socket to use. The child process begins by finding, in the connection structure, the socket that is currently being used to communicate with the client. Once this socket is found, it is passed to the correct child process through the Unix domain socket (S1 or S2 in the diagram). Finally, the part of the request that has already been read from the client is sent to the new child over the Unix domain socket. The original child process then closes its connection to the client, and longjmps out of the post_read_request phase to the end of request processing. This thread then goes back to listening for another new request.
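Passing an open descriptor between processes is an operating system facility (ancillary `SCM_RIGHTS` data on a Unix domain socket). The following Python sketch shows the idea using `socket.send_fds` and `socket.recv_fds`, which are available on Unix in Python 3.9 and later. It is an illustration under stated assumptions, not the MPM's code: `hand_off`, `take_over`, and `demo` are hypothetical names, both "children" live in one process for simplicity, and the descriptor and the already-read bytes travel in a single message rather than in two steps.

```python
import socket


def hand_off(unix_sock, client_sock, already_read):
    """Pass the client's descriptor plus the bytes already read, then close our copy.

    already_read must be non-empty so the ancillary data has a message to ride on.
    """
    socket.send_fds(unix_sock, [already_read], [client_sock.fileno()])
    client_sock.close()


def take_over(unix_sock, bufsize=65536):
    """Receive a passed descriptor and the request bytes that came with it."""
    data, fds, _flags, _addr = socket.recv_fds(unix_sock, bufsize, 1)
    client_sock = socket.socket(fileno=fds[0])
    return client_sock, data


def demo():
    # Stands in for the TCP connection between the client and the server.
    browser, server_side = socket.socketpair()
    # Stands in for the Unix domain socket shared by two child processes.
    first_child, correct_child = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)

    browser.sendall(b"GET / HTTP/1.0\r\nHost: bob.example.com\r\n\r\n")
    already_read = server_side.recv(4096)

    # The child that accepted the request gives it away...
    hand_off(first_child, server_side, already_read)
    # ...and the child that owns the virtual host answers it.
    client, request = take_over(correct_child)
    client.sendall(b"HTTP/1.0 200 OK\r\n\r\n")
    reply = browser.recv(4096)

    for s in (client, browser, first_child, correct_child):
        s.close()
    return request, reply
```

Note that the first child can close its copy of the client socket immediately after sending it: the receiving side gets its own descriptor for the same connection, so the client never notices the hand-off.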

Request processing then moves to the correct child process. Once a socket is passed over the Unix domain socket, the new child process is woken up out of poll with data on its end of the Unix domain socket. Each child has a table of sockets to use for this occasion, with one socket in the table for each thread in the process. Usually, the sockets are set to -1, but when a passed socket descriptor is detected, we set this thread's spot in the table to -2. Later, the fact that the socket is -2 is used to determine that we must receive the socket descriptor from the Unix domain socket. The received socket is then placed in the thread's position in the socket table.
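The per-thread table and its two sentinel values can be modeled in a few lines of Python. This is a rough sketch of the state machine only: the class `ThreadSocketTable`, its method names, and the `receive_fd` callback (which stands in for reading the descriptor off the Unix domain socket) are hypothetical.

```python
NO_SOCKET = -1   # nothing has been passed to this thread
FD_PENDING = -2  # a descriptor is waiting on the Unix domain socket


class ThreadSocketTable:
    """One slot per thread, tracking descriptors passed from other children."""

    def __init__(self, num_threads):
        self.slots = [NO_SOCKET] * num_threads

    def mark_pending(self, thread_id):
        """Called when poll reports data on the Unix domain socket."""
        self.slots[thread_id] = FD_PENDING

    def claim(self, thread_id, receive_fd):
        """If a descriptor is pending, receive it and store it in the slot."""
        if self.slots[thread_id] == FD_PENDING:
            self.slots[thread_id] = receive_fd()
        return self.slots[thread_id]

    def release(self, thread_id):
        """Reset the slot once the request has been served."""
        self.slots[thread_id] = NO_SOCKET
```

A thread whose slot is still -1 handles an ordinary connection; only a slot marked -2 triggers the extra step of receiving a descriptor.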

Processing then continues as normal, reading from the Unix domain socket, until the post_read_request phase. At this point, we know that the request has come from another child process in our server, and we know that this request is meant for this child process's user and group ID. The only thing left to do is replace the Unix domain socket that is currently in the connection structure with the socket that was passed from the first child process. This child then continues serving the original request.

This will never be the fastest MPM, because it relies on passing socket descriptors between processes, which is inherently slow. It would be much faster to give the server multiple IP addresses, and have different Apache installations listen to port 80 on different IPs. However, that can get very difficult to administer.

Alpha 5

This MPM was finished the day before the fifth alpha was released, so it is not well tested at all. Over the next few weeks and months, it will become more stable and more portable. So far it has only been tested on Linux, but with minor modifications it should work on almost all Unices. There has been talk of modifying the Windows MPM to allow the threads to change their identities for each request, but that has not happened yet.


Talkback(s)
  PerChild for directories?
Hi,

Just wanting to ask a question: will the perchild directive also use for userdirs? Or special directories?

Thnx   
  Aug 19, 2000, 14:43:59
   Re: PerChild for directories?
> Hi,
Just wanting to ask a question: will the perchild directive also use for userdirs? Or special directories?
Thnx

Possibly, but not immediately. The problem comes down to resources. Because each new user and group requires a new child process, it could potentially be resource intensive to have a new child process for each directory or location. The other problem is how to do this technically while still providing some kind of performance. Currently, because we are using the virtual host as the deciding factor, it is possible to determine very early on whether we should be forwarding to a new child process. Because of aliases and locations, there is no early place in processing a request where we can quickly determine if a request should be passed. The only place this could be done would be the fixups phase of the request.



The problem with using the fixups phase, is that we would end up processing the request twice. Once in the child that answers the request, and again in the child that should actually be processing the request.



I am also not sure how this would work with authentication and authorization. What I am thinking is that it would be possible for a child process to be running as a user that doesn't have access to a specific password database. But, authentication is required for a given directory. The child process that is specified for the directory is the only UID with access to the database. We will try to authenticate in the original child, and the authentication will fail. That will end the request.



I am just afraid that there are too many issues to solve this quickly and easily. This is probably more than you wanted, but I was thinking through it as I answered your question. :-)

Ryan   

  Aug 21, 2000, 03:53:20
  mod_cgi & suexec
Please, what do you mean by saying that in Apache 1.3
'mod_cgi can not take advantage of suexec'?

Thank you
Jan   
  Aug 22, 2000, 09:57:40
   Re: Re: PerChild for directories?
Thank you for your reply ...

So there still would be no way of running the module version of php for my day to day lusers ... Am I correct?

Thnx,
Frank   
  Aug 22, 2000, 14:55:50
   Re: mod_cgi & suexec
That is a typo. That should have read mod_perl. I am sorry about that, I obviously didn't read that closely enough when I reviewed it.

Ryan   
  Aug 23, 2000, 14:54:08
   Re: Re: Re: PerChild for directories?
I am not sure I understand your question. If you are concerned about your users and mod_php, then you could just set up a PHP-only virtual host. This host could run with very low permissions, and would be responsible for just running PHP scripts.

Ryan   
  Aug 23, 2000, 14:56:11
  easier option ??
I get a similar result to what is trying to be done by putting the users http Files in their /home/public_html Dir and then with the use of ProFTPD, simply lock them into their own /home folder with the Defaultroot statement. This seems to work ok giving each user the ability to edit their own website only.

It took some fiddling but it works for me.

ed@softalk.co.uk   
  Aug 26, 2000, 10:26:17
   Re: easier option ??
> I get a similar result to what is trying to be done by putting the users http Files in their /home/public_html Dir and then with the use of ProFTPD, simply lock them into their own /home folder with the Defaultroot statement. This seems to work ok giving each user the ability to edit their own website only.
It took some fiddling but it works for me.
ed@softalk.co.uk

That sounds like it will work as far as static pages go, but it doesn't deal with running CGI scripts or SSIs as a given user, which was a big part of the problem. CGI scripts and SSIs can be a big security issue on some sites, so being able to limit just what they can do is an important problem to solve.

Ryan   
  Aug 28, 2000, 14:32:08
  PerChild

I've looked at mod_become as an alternative for Apache 1.3. Any comments on other ways it can be done on 1.3?


Awais
  
  Dec 6, 2001, 11:15:14
All times are recorded in UTC.
Copyright INT Media Group, Incorporated All Rights Reserved.