Throw more $$ at this problem: hardware is cheap compared to people.
It would be highly unusual for the TDS not to be I/O bound, so buying a high-performance disk subsystem is a much better investment than buying fast CPUs. Slower, more energy-efficient multicore processors are optimized for web server loads.
Typically, disk access is much faster on a local drive than on an NFS-mounted drive. High-performance disk subsystems such as RAID arrays or SANs will significantly improve TDS throughput.
If you have system administration resources, examine the file systems available for your OS, e.g. on Linux or Solaris. We use the ZFS file system on Solaris-X86, and it is very fast. We use ZFS software RAID, which replaces hardware RAID.
The OS typically limits the number of open file handles per process. To check this value on Unix, use:
ulimit -n
If you are using the default TDS configuration values, this value should be 1024 or greater. Otherwise, you can tune this number based on your own settings. For example, to set this value to 2048, add the following to the Tomcat startup.sh script:
ulimit -n 2048
This affects the number of files to keep in the File Handle Caches.
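A startup script can also verify the limit before launching Tomcat. The following is a minimal sketch and not part of the TDS distribution; the function name `nofile_status` is hypothetical, and the 1024 threshold is the default mentioned above.

```shell
# Hypothetical helper (not part of TDS or Tomcat): compare an open-file
# limit against the minimum you need.
# Usage: nofile_status CURRENT REQUIRED
nofile_status() {
    current="$1"
    required="$2"
    # "ulimit -n" prints "unlimited" when no limit is set.
    if [ "$current" = "unlimited" ] || [ "$current" -ge "$required" ]; then
        echo "ok: open-file limit is $current"
        return 0
    fi
    echo "warning: open-file limit is $current, need at least $required" >&2
    return 1
}

# Example call; 1024 matches the default TDS configuration noted above:
#   nofile_status "$(ulimit -n)" 1024
```

Passing the current limit as an argument keeps the check separate from the `ulimit` call itself, so the same function can be reused for a limit read from another source.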
-d64 -Xmx4081m -Xms512m -server -XX:MaxPermSize=256m -XX:+HeapDumpOnOutOfMemoryError -Djava.awt.headless=true
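One common way to pass JVM options like these is a `setenv.sh` file in Tomcat's `bin` directory, which `catalina.sh` sources at startup if it exists. The sketch below assumes that layout and uses `CATALINA_OPTS`, which Tomcat applies to the start, run, and debug commands but not to stop; it is an illustration, not the only way to set them. Note that JDK 8 removed the permanent generation, so it ignores `-XX:MaxPermSize` with a warning; that flag matters only on JDK 7 and earlier.

```shell
# Assumed location: $CATALINA_HOME/bin/setenv.sh
# catalina.sh sources this file, if present, before starting the JVM.
CATALINA_OPTS="-d64 -Xmx4081m -Xms512m -server -XX:MaxPermSize=256m -XX:+HeapDumpOnOutOfMemoryError -Djava.awt.headless=true"
export CATALINA_OPTS
```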
We recommend the latest version of JDK 8, and soon we will require it.
We recommend the latest stable version of Tomcat 7. This requires JDK 1.6 or above.
Tomcat can be configured to automatically compress the responses, whenever the client allows that. Compression is usually a big win, especially for bandwidth-limited sites. Deciding when and what to compress depends on a lot of factors, however. We use the following settings in server.xml:
<!-- non-SSL HTTP/1.1 Connector on port 8080 -->
<Connector port="8080"
           protocol="HTTP/1.1"
           maxThreads="50"
           connectionTimeout="20000"
           redirectPort="8443"
           compression="1000"
           compressableMimeType="text/html,text/xml,text/plain,application/octet-stream" />
This says to compress (gzip or deflate) responses of 1000 bytes or more, for the named MIME types. See the Tomcat HTTP Connector reference page for more details.
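To confirm that compression is actually applied, one can request a document with an `Accept-Encoding` header and inspect the response headers. The helper below is a rough sketch: `is_compressed` is a hypothetical name, and the URL in the usage comment assumes a TDS running locally on port 8080 and serving its top-level catalog.

```shell
# Hypothetical helper: read HTTP response headers on stdin and succeed
# if the response was gzip- or deflate-encoded.
is_compressed() {
    # Strip carriage returns, then look for a Content-Encoding header.
    tr -d '\r' | grep -i -q -E '^content-encoding:[[:space:]]*(gzip|deflate)'
}

# Example (assumed URL; -D - prints headers, -o /dev/null drops the body):
#   curl -s -o /dev/null -D - -H 'Accept-Encoding: gzip' \
#        http://localhost:8080/thredds/catalog.xml | is_compressed \
#        && echo "compressed"
```

Responses smaller than the configured threshold, or with a MIME type outside the list, should come back without a `Content-Encoding` header.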
In a production environment, Tomcat should be automatically restarted when the machine starts. How to do this depends on what OS you are running. This FAQ has a bit of info.
Once thredds.war is expanded, manually copy everything in ${tomcat_home}/webapps/thredds/initialContent/root/ to ${tomcat_home}/webapps/ROOT/ .
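The copy step can be scripted. This is a minimal sketch that assumes thredds.war has already been expanded; `install_root_content` is a hypothetical name, and its argument is your own Tomcat installation directory.

```shell
# Hypothetical helper: copy the TDS-supplied ROOT content into Tomcat's
# ROOT webapp. Argument: the Tomcat installation directory.
install_root_content() {
    tomcat_home="$1"
    src="$tomcat_home/webapps/thredds/initialContent/root"
    dst="$tomcat_home/webapps/ROOT"
    if [ ! -d "$src" ]; then
        echo "error: $src not found; has thredds.war been expanded?" >&2
        return 1
    fi
    # The trailing "/." copies the directory's contents, including dotfiles.
    mkdir -p "$dst" && cp -R "$src/." "$dst/"
}

# Example (placeholder path):
#   install_root_content /path/to/tomcat
```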
The TDS caches file handles to minimize OS overhead. Currently the defaults assume that the Tomcat process is limited to 1024 file handles. If you can allow more, you can increase the sizes of the FileCaches for better performance. You can change these settings in the threddsConfig.xml file.
These numbers limit performance, but not functionality. For example, the number of files in an aggregation is not limited by these file handle limits.
Each NetcdfFile object encapsulates a file. NcML aggregations are careful not to keep component files open. When the number of cached files exceeds maxElementsInMemory, a cleanup thread starts after 100 msecs. So the number of cached files can get larger than maxElementsInMemory in the interim, but unless you are really hammering the OS by opening many files per second, it shouldn't get too much bigger. Still, leave some cushion, depending on your expected rate of opening files.
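To judge how much cushion a running server actually has, one can compare the number of descriptors the process holds open with its limit. This is a minimal sketch, assuming a Linux system where `/proc/<pid>/fd` is readable by the current user; `fd_headroom` is a hypothetical name, and the limit you pass should be the one in effect for the Tomcat process, not necessarily that of your login shell.

```shell
# Hypothetical helper, Linux only: print how many more file descriptors
# a process can open before reaching the given limit.
# Usage: fd_headroom PID LIMIT
fd_headroom() {
    pid="$1"
    limit="$2"
    if [ ! -d "/proc/$pid/fd" ]; then
        echo "error: cannot read /proc/$pid/fd" >&2
        return 1
    fi
    # Each entry in /proc/PID/fd is one open descriptor.
    open="$(ls "/proc/$pid/fd" | wc -l)"
    echo $(( limit - open ))
}

# Example (assumes a single Tomcat JVM, found by its main class name):
#   fd_headroom "$(pgrep -f org.apache.catalina.startup.Bootstrap)" 1024
```

Sampling this under realistic load gives a rough idea of whether the file cache sizes leave enough room for files opened between cleanup runs.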
The TDS writes temporary files and cache files. By default these are stored under ${content_root}/thredds/cache. These directories can get large. You might want to relocate them to another place, for example if ${tomcat_home} has limited space. Also, there's no need to back up the cache directories, so they can be placed on a disk that is not backed up. The easiest thing to do is to create a symbolic link from ${content_root}/thredds/cache to wherever you want these files to live.
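The relocation can be sketched as a small function. `relocate_cache` and both paths in the usage comment are hypothetical; the sketch assumes Tomcat is stopped while the directory is moved and that the new location is given as an absolute path so the link resolves from anywhere.

```shell
# Hypothetical helper: move a cache directory elsewhere and leave a
# symbolic link at the old location. Run it while Tomcat is stopped.
# Usage: relocate_cache CACHE_DIR NEW_LOCATION   (absolute paths)
relocate_cache() {
    cache_dir="$1"
    new_location="$2"
    # Refuse to act if the cache is missing or is already a symlink.
    if [ ! -d "$cache_dir" ] || [ -L "$cache_dir" ]; then
        echo "error: $cache_dir is not a real directory" >&2
        return 1
    fi
    # Refuse to overwrite or nest inside an existing target.
    if [ -e "$new_location" ]; then
        echo "error: $new_location already exists" >&2
        return 1
    fi
    mkdir -p "$(dirname "$new_location")" &&
    mv "$cache_dir" "$new_location" &&
    ln -s "$new_location" "$cache_dir"
}

# Example (both paths are placeholders):
#   relocate_cache /path/to/content/thredds/cache /data/thredds-cache
```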
The OPeNDAP-Java layer of the server currently has to read the entire data request into memory before sending it to the client (we hope to get a streaming I/O solution working eventually). Generally clients only request subsets of large files, but if you need to support large data requests, make sure that the -Xmx parameter above is set accordingly.
If you are serving GRIB files through any of the subsetting services (OPeNDAP, WCS, etc.), the CDM must write indices the first time it tries to read them. This can take several minutes for very large GRIB files. For large aggregations and collections, this can take hours or even days. By using the TDM to index GRIB files before they are accessed, you give users much faster response times. As of TDS 4.6, if these collections change, you must use the TDM to detect those changes; the TDS no longer updates GRIB collections on the fly.