Java Apps and the External Environment
1 Overview
Java applications, although isolated by the JVM and the standard
library from most OS details, need to interact with their environment
to be useful. We need to configure the app, which is often file based,
and we need to store data and log files somewhere.
The danger is that this will introduce subtle dependencies on the
underlying operating system. This causes friction when the app is used
in other environments, such as development and testing.
Conversely, certain OSes make assumptions about how things are
handled. Debian, for instance, has strong opinions on what should be
where.
2 Patterns for External Files
2.1 Standalone Directory
Most Java apps seem to have chosen the standalone directory pattern
as the basis of their deployment.
$ tree -d Tools/apache-maven-3.0.4
Tools/apache-maven-3.0.4
├── bin
├── boot
├── conf
└── lib
    └── ext

$ tree -d -L 2 Tools/apache-tomcat-6.0.35
Tools/apache-tomcat-6.0.35
├── bin
├── conf
├── lib
├── logs
├── temp
├── webapps
│   ├── docs
│   ├── examples
│   ├── host-manager
│   ├── manager
│   └── ROOT
└── work
This has many advantages. All references to files can now be made
using paths relative to the root folder. The number of assumptions
that need to be made about the underlying OS is minimized.
In the case of the Apache projects, the remaining differences are
handled in platform-specific startup scripts, and the app hardly sees
any of it. These startup scripts are very complicated, though, and
have been reused and refined over many projects.
For our own projects we do not need to make such complicated startup
scripts, but this is the right place to put the glue between the OS
and the app. If it is buried in the app, it is just as complicated,
and impossible for sysadmins to modify when the deployment conditions
change.
Many frameworks support this way of working by exposing the root
directory in a configuration variable, allowing easy configuration
relative to the root.
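As a minimal sketch of this idea, the class below resolves files
relative to a root folder handed in by the startup script. The
property name app.home and the class name AppHome are assumptions made
for this illustration, not something a particular framework defines.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

/** Resolves files relative to the application's root folder. */
class AppHome {
    // Hypothetical property name; a startup script would pass it
    // as -Dapp.home=/path/to/app.
    static final String HOME_PROPERTY = "app.home";

    private final Path root;

    AppHome(Path root) {
        this.root = root.toAbsolutePath().normalize();
    }

    /** Uses -Dapp.home if set, otherwise the current working directory. */
    static AppHome fromSystemProperty() {
        String home = System.getProperty(HOME_PROPERTY,
                System.getProperty("user.dir"));
        return new AppHome(Paths.get(home));
    }

    /** Resolves a path such as "conf/app.properties" against the root. */
    Path resolve(String relative) {
        return root.resolve(relative).normalize();
    }

    public static void main(String[] args) {
        AppHome home = AppHome.fromSystemProperty();
        System.out.println("config: " + home.resolve("conf/app.properties"));
        System.out.println("logs:   " + home.resolve("logs"));
    }
}
```

The only OS-specific knowledge, the location of the root folder, stays
in the startup script; the app itself only ever sees relative paths.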
2.2 Resource Loading Files
Many Java products load their files as resources from the classpath
instead of opening them directly.
This makes it easy to provide sensible defaults in the jar
files. Maven also has special support for this: it adds the
resources folder to the classpath before the classes and the jars.
When testing, src/test/resources is added before that. On
deployment the ${appname}/conf folder is added before the jars.
By putting the right config file in the right location for defaults
(src/main/resources), testing (src/test/resources) and deployment
(${appname}/conf), the app is properly configured without the need
for any smarts in the app.
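A sketch of what this looks like from the app's side, assuming a
properties file; the class name ClasspathConfig and the file name
app.properties are made up for the example. The app asks the class
loader for a name and takes whatever comes first on the classpath.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.Properties;

class ClasspathConfig {
    /**
     * Loads a properties file by name from the classpath. The class
     * loader returns the first match, so a copy in a folder placed
     * earlier on the classpath shadows the default inside the jar.
     * Returns empty properties when the resource is not found.
     */
    static Properties load(String resourceName) {
        Properties props = new Properties();
        ClassLoader loader = Thread.currentThread().getContextClassLoader();
        if (loader == null) {
            loader = ClasspathConfig.class.getClassLoader();
        }
        try (InputStream in = loader.getResourceAsStream(resourceName)) {
            if (in != null) {
                props.load(in);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    public static void main(String[] args) {
        // "app.properties" is a hypothetical file name.
        Properties props = load("app.properties");
        System.out.println("loaded " + props.size() + " settings");
    }
}
```

Note that the code contains no knowledge of dev, test or production;
which file is found is decided purely by the classpath order.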
2.3 Separate Internal Configuration from External Configuration
This is especially true for Spring, where it would be suicide to do
otherwise, but it is in fact generally applicable. The point is that
some configuration is intended to be changed by the sysadmins and
some is not.
Writing modular, loosely coupled software is great and good
practice. Gluing stuff together using some form of configuration
file is just as great. Part of this configuration is real product
design, and changing it would make it a different product. This
includes how the core pieces are wired together. This part should be
internal and separated from the external config.
Other configuration concerns details that vary between deployments
without altering the purpose of the product: IP addresses, names,
email addresses, database connections and similar options. These
belong in the external config files.
Note that significant parts of the app can be provided by plugging in
components. Of course these components need to be externally
configured too, so their settings go in external config files as well.
Import the external files and the internal files in a way that the
external files can override the internal ones.
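The override order can be sketched with plain java.util.Properties
(not Spring), using the defaults mechanism that class provides. The
keys and values are invented for the example, and the two strings
stand in for an internal and an external config file.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

class LayeredConfig {
    /** Parses properties text; lookups fall back to the given defaults. */
    static Properties parse(String text, Properties defaults) {
        Properties props = new Properties(defaults);
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    public static void main(String[] args) {
        // Internal defaults, shipped inside the jar (hypothetical keys).
        Properties internal = parse(
                "db.url=jdbc:h2:mem:dev\nmail.from=noreply@example.com\n", null);
        // External file, managed by the sysadmin: only what differs.
        Properties effective = parse(
                "db.url=jdbc:postgresql://db.example.com/app\n", internal);

        System.out.println(effective.getProperty("db.url"));
        // prints "jdbc:postgresql://db.example.com/app"
        System.out.println(effective.getProperty("mail.from"));
        // prints "noreply@example.com"
    }
}
```

The external file only has to list the values that differ from the
defaults, which helps to keep it small.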
Copy the default external config files to the ${appname}/conf
folder so the admin who has to manage them can immediately see the
defaults. Also take care to comment them so that the person editing
them does not need to go digging for the manual.
Please keep the configuration files small. The ideal application is a
zero configuration app which auto-detects its settings from existing
resources, not an app where every feature can be tweaked and
customized. Every configuration parameter needs to be coded,
documented, deployed, managed, reviewed, adjusted and corrected
(usually several times). So this ends up being very expensive.
External configuration is poison: use it in medicinal quantities (not
necessarily homeopathic ones; if it is needed, it is needed).
2.4 Logging and Monitoring
Since both these things are essentially non-functional requirements,
they should be pushed down to the platform and out of the app.
All logging frameworks are essentially pluggable. They collect all the
log messages in a back-end independent way and send them to an actual
logging implementation, an appender, which usually writes to a file
but could just as well send an email, JMS message, SNMP trap, …
Of course where those messages end up largely depends on the
organization supporting the app and should be decided by them. So the
final loggers should be treated as externally configurable components.
So the app should not get involved with the details of logging. Just
add a default config with some sensible defaults (size-based rotating
log files, so the dev and test machines do not run out of disk space)
in an external config file. Please add a comprehensive set of
commented log targets so the admins can easily change the log levels
in a granular way to support the app effectively.
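A sketch of this split, using java.util.logging only because it ships
with the JDK; other logging frameworks have equivalent mechanisms.
The string SAMPLE_CONFIG stands in for an external file, and the
logger names and file pattern are assumptions for the example.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.logging.LogManager;
import java.util.logging.Logger;

class ExternalLogging {
    // Stand-in for an external logging config file.
    static final String SAMPLE_CONFIG = String.join("\n",
            "# Size-based rotation: 5 files of about 1 MB in the temp folder",
            "handlers=java.util.logging.FileHandler",
            "java.util.logging.FileHandler.pattern=%t/myapp-%g.log",
            "java.util.logging.FileHandler.limit=1000000",
            "java.util.logging.FileHandler.count=5",
            "java.util.logging.FileHandler.formatter=java.util.logging.SimpleFormatter",
            "# Default level, followed by per-category overrides",
            ".level=INFO",
            "com.example.orders.level=FINE",
            "");

    /** Hands the external configuration to the logging framework. */
    static void configure(InputStream config) {
        try {
            LogManager.getLogManager().readConfiguration(config);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        configure(new ByteArrayInputStream(
                SAMPLE_CONFIG.getBytes(StandardCharsets.UTF_8)));
        // App code only names a category and logs; it knows nothing
        // about files, rotation or levels.
        Logger log = Logger.getLogger("com.example.orders");
        log.fine("order received");
    }
}
```

In a real deployment the explicit configure call is not needed
either: the JVM reads the file named by the system property
java.util.logging.config.file, which the startup script can set.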
Similarly, the app may rely on an external monitoring system that
watches the error logs for critical errors. Document these errors in
the Operations Manual under the monitoring section.
Also make sure that the app behaves consistently with the protocols it
uses. A website that hits an error should return a 5xx status, a
request to a REST API for an entity that does not exist should return
404, and so on, following whatever is the norm for the protocol. This
makes monitoring with tools like Nagios a breeze, as no parsing of the
page needs to be done.
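As an illustration, here is a small sketch using the HTTP server that
is bundled with the JDK (com.sun.net.httpserver). The /items/ path,
the in-memory map and port 8080 are invented for the example; a real
app would get the same behaviour from its web framework.

```java
import com.sun.net.httpserver.HttpExchange;
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.nio.charset.StandardCharsets;
import java.util.Map;

class StatusAwareApi {
    // Hypothetical in-memory store standing in for the real data source.
    static final Map<String, String> ITEMS = Map.of("1", "first item");
    static final String PREFIX = "/items/";

    /** Looks up an item by request path; null when it does not exist. */
    static String find(String path) {
        return ITEMS.get(path.substring(PREFIX.length()));
    }

    /** Maps the outcome of a lookup to the status a monitor can check. */
    static int statusFor(String path) {
        try {
            return find(path) != null ? 200 : 404;
        } catch (RuntimeException e) {
            return 500; // unexpected failure: report a server error
        }
    }

    static void handle(HttpExchange exchange) throws IOException {
        String path = exchange.getRequestURI().getPath();
        int status = statusFor(path);
        String body = status == 200 ? find(path) : "error " + status;
        byte[] bytes = body.getBytes(StandardCharsets.UTF_8);
        exchange.sendResponseHeaders(status, bytes.length);
        try (OutputStream out = exchange.getResponseBody()) {
            out.write(bytes);
        }
    }

    public static void main(String[] args) throws IOException {
        HttpServer server = HttpServer.create(new InetSocketAddress(8080), 0);
        server.createContext(PREFIX, StatusAwareApi::handle);
        server.start();
    }
}
```

A monitoring check then only needs to compare the status code of a
known URL with the expected value.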
2.4.1 What if the business asks for Special Monitoring?
Tough question. In principle it is now a functional business
requirement and there should be a story for it. There is a risk that
this requirement gets broken by a config change during routine
maintenance.
The best way to deal with it is to make it part of the application,
but still push it as far down in the framework/libraries as possible.
For example, if the logging framework can be leveraged, then the
internal configuration could include a predefined appender for the
business notifications, separate from the external appenders.
In practice, deal with such requests on a case-by-case basis. Maybe
you can talk the business out of it, or rely on Nagios configuration
managed by Ops? Talk to the stakeholders.
2.4.2 Real-time monitoring and administration
Allowing access to internal values, parameters and admin functions
through a standard management framework like JMX is another
interesting pattern that is often seen.
Implementing this is straightforward, and a plethora of tools provide
a UI for managing the exposed information, so the code can focus on
the business value rather than on adding stuff to manage that stuff.
Just do not forget to document it; self-documentation is best, of
course. Also give instructions in the Ops manual on how to control
access to this functionality.
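To give an idea of how little code is involved, here is a sketch of a
standard MBean registered with the platform MBean server. The names
OrderStats, ProcessedCount and the com.example.app domain are
hypothetical; only the javax.management calls are real JDK API.

```java
import java.lang.management.ManagementFactory;
import java.util.concurrent.atomic.AtomicLong;
import javax.management.JMException;
import javax.management.MBeanServer;
import javax.management.ObjectName;

class JmxSketch {
    /** Management interface; JMX derives the attribute and operation from it. */
    public interface OrderStatsMBean {
        long getProcessedCount(); // exposed as attribute "ProcessedCount"
        void reset();             // exposed as operation "reset"
    }

    public static class OrderStats implements OrderStatsMBean {
        private final AtomicLong processed = new AtomicLong();

        public void recordOrder() { processed.incrementAndGet(); }
        @Override public long getProcessedCount() { return processed.get(); }
        @Override public void reset() { processed.set(0); }
    }

    // Hypothetical object name for this example.
    static final String NAME = "com.example.app:type=OrderStats";

    /** Registers a fresh OrderStats bean, replacing any earlier one. */
    static OrderStats register() {
        try {
            MBeanServer server = ManagementFactory.getPlatformMBeanServer();
            ObjectName name = new ObjectName(NAME);
            if (server.isRegistered(name)) {
                server.unregisterMBean(name);
            }
            OrderStats stats = new OrderStats();
            server.registerMBean(stats, name);
            return stats;
        } catch (JMException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Reads an attribute the way a management tool would. */
    static Object readAttribute(String attribute) {
        try {
            return ManagementFactory.getPlatformMBeanServer()
                    .getAttribute(new ObjectName(NAME), attribute);
        } catch (JMException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        OrderStats stats = register();
        stats.recordOrder();
        System.out.println(readAttribute("ProcessedCount"));
    }
}
```

Once registered, the bean can be inspected with a JMX client such as
jconsole without any further code in the app.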
Some projects notify developers and stakeholders immediately when
exceptions or other notable events occur. Another great pattern, but
try to push it out of the app using standard features of frameworks
like the logging framework, Camel, …
Copying classes from other projects is definitely not recommended;
it is a sign of a library shouting to come out. Refactor it as a
separate module and ask to make it part of the company foundation, so
it is just there when needed. Just look and ask around first to check
that this is not a wheel which was already invented.
2.5 Complying with OS rules through packaging
The above approach of storing everything under one folder runs
against the grain of the Linux standards (although it is the usual
way of working on Mac and Windows).
I’ll treat the case for Debian-based distros here, but the same is
possible for Red Hat and other distros.
In short, use symbolic links to move the folders to the locations
where Linux is happy and keep them visible in the local folder for
the JVM. Everyone is happy.
2.5.1 Main Deploy folder
All read-only stuff, which is the real application stuff, is expected
somewhere beneath /usr (but not /usr/local, which is reserved for
locally compiled software, something we never do).
I recommend creating the app home folder in /usr/share/${appname}
and copying all libraries, binaries, scripts, static resources, etc.
into it.
2.5.2 Config files
Config files in Debian are expected under the /etc folder, and the
package manager will automatically flag files deployed there as
config files, so this does not need to be done separately (unless you
want to change the defaults of course).
Just move the default config files to /etc/${appname} and create a
symbolic link:
${appname}: ln -s /etc/${appname} conf
Well, I guess debhelper has better tools for this, so use whatever
is usual for the build tool you use.
2.5.3 Data files
Stored data should end up somewhere under /var. I recommend using
the folder /var/lib/${appname} and creating subfolders there, which
you link back to the main deploy folder. If you only need one data
folder, you do not need to create subfolders of course.
2.5.4 Log Files
Log files are expected beneath /var/log.
Create a folder /var/log/${appname} and link this to
${apphome}/logs. Make sure the folder is owned by the user the app
will be running as.
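The effect of such a link can be checked from Java. The sketch below
uses temporary folders as stand-ins for /usr/share/${appname} and
/var/log/${appname}, since the real locations need a packaged
install; the class and method names are made up for the example.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

class LinkedFolders {
    /** Returns where a folder under the app home really lives. */
    static Path realLocation(Path appHome, String folder) {
        try {
            // toRealPath follows symbolic links by default.
            return appHome.resolve(folder).toRealPath();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        // Temporary stand-ins for the app home and the /var/log folder.
        Path appHome = Files.createTempDirectory("apphome");
        Path varLog = Files.createTempDirectory("varlog");
        Files.createSymbolicLink(appHome.resolve("logs"), varLog);

        // The app keeps writing to its relative "logs" folder ...
        Path logFile = appHome.resolve("logs").resolve("app.log");
        Files.write(logFile, "started\n".getBytes(StandardCharsets.UTF_8));

        // ... and the file ends up in the linked location.
        System.out.println(realLocation(appHome, "logs"));
        System.out.println(Files.exists(varLog.resolve("app.log"))); // prints "true"
    }
}
```

The app code is identical to the standalone directory case; only the
packaging decides where the bytes end up.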
2.5.5 Dotfiles
Now we get to the hard cases. Normally this is only needed for
desktop apps; server apps should never use personal dotfiles. However
this is one of those cases where you never should say never.
For desktop apps, use the Java support for dotfiles (the
java.util.prefs API). This will use personal dotfiles on Unixy OSes
and the registry on Windows. Easy-peasy for greenfield apps. Problem
solved.
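A minimal sketch with java.util.prefs.Preferences, which is the JDK
API for per-user settings. The node path com/example/myapp and the
key lastFolder are invented for the example.

```java
import java.util.prefs.Preferences;

class DesktopSettings {
    // Hypothetical node path identifying this application.
    static final String NODE = "com/example/myapp";

    /** Per-user settings node; the JDK picks the storage for the OS. */
    static Preferences node() {
        return Preferences.userRoot().node(NODE);
    }

    static void rememberLastFolder(String folder) {
        node().put("lastFolder", folder);
    }

    /** Falls back to the user's home folder when nothing was stored. */
    static String lastFolder() {
        return node().get("lastFolder", System.getProperty("user.home"));
    }

    public static void main(String[] args) {
        rememberLastFolder("/tmp/demo");
        System.out.println(lastFolder()); // prints "/tmp/demo"
    }
}
```

The app never names a file or a registry key itself, so the same code
runs unchanged on every platform.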
For 3rd party apps or libs, we have to play the hand we’re
dealt. A typical example is .netrc, which is used to store passwords
outside the app. Good practice, but a major headache.
For server apps, try to avoid it. Before you know it, you can no
longer do a ‘git clone …; mvn install’ to build it. Keeping build
dependencies down is critical for long term support and easy
onboarding.
In any case, dotfiles are not a deployment issue, other than making
sure they are documented and that samples are available for
complicated files.
2.6 Apps deployed on a runtime platform
Many Java apps, components, webapps, … are deployed on some kind of
runtime, be it a servlet container, appserver, OSGi container, …
Great. Leverage it. Push all this stuff down into the container, so
you can surf on the work done by the container packager.
For instance the JBoss server has a folder …/conf in the instance
being started, which is on the classpath. Just dump your external
config files there with cfengine or whatever you use for deploying.
Log files are also taken care of, as that is a service the container
should be offering. Just document the important categories and
log levels as usual; the rest is the concern of the container admin.
In general if you deploy on a controlled environment, expect that
your external dependencies are provided by the container. Work with
the container owner to find the sweet spot.
For testing this is no issue, as Maven will do the right thing in
unit and integration testing.
3 Conclusion
In order to focus on the value of apps, we must keep the business
logic and stuff like data files, config files and log files as far
from each other as possible. It is often already difficult enough
(read: expensive) to fix bugs without having all that cruft sprinkled
through the codebase and essential configs. Most of the requirements
posed by the details of connecting the app code and the external
stuff fall in the realm of the non-functional and should be
moved as much as possible outside of the programmed code and into
frameworks and runtime containers and into the hands of the admins.
The best way to deal with those external dependencies is to push
them away from the app code and ignore them for the rest. With the
guidelines above this can be realized to a large extent in a
straightforward way.
Both configuration code and configuration parameters are poison over
time. Use them in medicinal doses.