Both Science and C&E News have acknowledged the current 'revolutionary advance' in cryo-electron microscopy single-particle reconstruction.
These advances in the resolution of reconstructions rely on new direct electron capture cameras, and the publications that I have seen use Relion software for the reconstruction.
I am still uncertain how much of the improved resolution arises from the improved software. At issue is not only the reconstruction methodology but also the resolution metric.
If Relion is a significant source of the improvement, the question arises of the future role of other software packages in reconstruction. Currently Relion is able to handle most of the reconstruction pathway except for particle selection (windowing) and initial reference model construction.
These other packages include SPIDER, EMAN2, SPARX, Xmipp, IMAGIC, Bsoft, SIMPLE, and some others. They still contain some capabilities not found in Relion, e.g.
With the exception of these capabilities, what is the future function of these packages? Will they survive Relion's ascent? How much future development should be done on them? What will be the impact on funding for software other than Relion?
EM software development funding by NIH in the US is currently in a rather bad state. SPIDER, as well as IMOD and its associated software, have lost most or all of their funding. At NIH almost all software development grants, for widely different purposes, compete directly with each other and also compete with funding for various biological databases. This lack of targeting leads to poor quality reviewing.
For example, in the case of SPIDER, one of the three reviewers of our most recent grant application stated:
"the number of investigators employing SPR is limited and not expected to grow substantially". It is difficult for me to see how a knowledgeable reviewer could come to such a conclusion in the midst of a 'revolutionary advance'.
There does not appear to be any viable non-grant mechanism for the continued maintenance of scientific "Free Open Source Software". Is it reasonable to hope that researchers will direct voluntary monetary donations to software developers, as some have suggested? Can researchers even get such a contribution approved by their local grant administrators? Do their auditors OK such an unobligated contribution? There are additional problems with currency conversion. Certainly the red tape involved in both donating and accepting a donation conspires against this idea. Up until now most software development has existed as a sort of side-operation of previously fairly well funded EM labs, in our case an 'NIH research resource'. Such funding is increasingly at risk, and long-term development and maintenance of software is disappearing.
This uncertainty in funding confounds discussion on the future of EM software. Where do we go from here? Do you see continued use of SPIDER and other software packages?
29 Nov. 2012 ArDean Leith
A single particle reconstruction from cryo-EM images of non-symmetrical objects often requires 100,000 to 1,000,000 images. If such a large number of images is stored as individual files in most common Linux filesystems, accessing or adding images will cause thrashing of the filesystem and extremely slow access. This occurs not just in processes accessing the images but throughout all access to that filesystem.
To overcome this thrashing one can purchase an expensive parallel file storage system (e.g. from Panasas) or, more commonly, aggregate the images into 'stacks' or, less commonly, into a database. Most EM software packages support some sort of file-based stack. Several different EM single particle reconstruction packages support both MRC and SPIDER format files to various extents.
The MRC stack file format is an especially poor choice for your stacks. There is a single 1024 byte header for the whole stack, then individual images are concatenated into the stack without any image-specific header.
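The layout described above can be illustrated with a minimal Python sketch. It assumes 32-bit little-endian float pixels and no extended header, and the function names are hypothetical. The point is that an image is located purely by arithmetic on its index, so there is nowhere in the stack to keep per-image information.

```python
import io
import struct

HEADER_BYTES = 1024  # the single header for the whole stack


def image_offset(index, nx, ny, bytes_per_pixel, header_bytes=HEADER_BYTES):
    """Byte offset of image `index` (0-based) in a stack with no per-image headers."""
    if index < 0:
        raise ValueError("index must be non-negative")
    return header_bytes + index * nx * ny * bytes_per_pixel


def read_image(stream, index, nx, ny):
    """Read one image as a flat tuple of nx*ny float32 values (assumed pixel type)."""
    stream.seek(image_offset(index, nx, ny, 4))
    raw = stream.read(nx * ny * 4)
    return struct.unpack("<%df" % (nx * ny), raw)


# Build a tiny synthetic stack in memory: 3 images of 2x2 pixels,
# where every pixel of image i holds the value i.
nx, ny, n_images = 2, 2, 3
stack = io.BytesIO()
stack.write(bytes(HEADER_BYTES))  # zero-filled stand-in for the real header
for i in range(n_images):
    stack.write(struct.pack("<4f", *([float(i)] * 4)))

second_image = read_image(stack, 1, nx, ny)
```

Anything specific to one image (defocus, alignment parameters, and so on) has to be kept in a separate file when the stack is laid out this way.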
4 Sept. 2012 ArDean Leith
We recently introduced improved interpolation using FBS inside several SPIDER operations. We have shown that FBS gives significant improvements over the linear and quadratic interpolation used in SPIDER previously and is as good as the much slower gridded interpolation available in SPARX.
During refinement of a reference-based reconstruction, interpolation is used at four steps. These are: creation of reference images from an existing reference volume, application of existing alignment parameters to the experimental images, conversion of image rings to polar coordinates, and alignment of images prior to back projection into a volume.
When we modified grploop.pam, our recommended procedure for refinement, to use the FBS interpolation alternatives in SPIDER and tested the refinement step on actual cryo-EM data, we were perplexed to find a small but repeatable decline in the reconstruction resolution of an overall refinement step.
We investigated this decline using a ribosome data set consisting of four sets of noisy experimental images, taken at different defocus levels, containing over 6000 images. The decrease in resolution is caused by the application of existing alignment rotations and translations to the experimental images before these images are compared to the reference projections to determine the best matching pairs. The 'RT SQ' operation uses quadratic interpolation, which adds an asymmetric filter effect to the results. This filtration ended up cutting noise in the aligned experimental images so that they gave a better choice of matching reference images. Poorer interpolation gave a better outcome! But this observation pointed to a method of improving the refinement step. We have added an option to denoise the experimental images prior to the reference comparison in the 'AP SHC' operation. We evaluated Fourier lowpass, averaged box convolution, median box convolution, mean shift denoising, and anisotropic diffusion denoising before settling on the Fourier lowpass filter as giving the best resolution results.
We have modified our recommended refinement procedure to use FBS interpolation in: 'PJ 3F' for the creation of the reference projections, 'AP SHC' during application of existing alignment parameters to the experimental images, and 'RT SF' for creating the view used for backprojection. We also used FBS interpolation during conversion of image rings to polar coordinates. These improvements, which are present in grploop.pam, gave a significant improvement in resolution over the course of a complete refinement series compared to our previous procedure.
29 Aug. 2012 ArDean Leith
We have developed a 2D and 3D Fourier-based Spline Interpolation Algorithm (FBS) in order to improve the performance of rescaling, rotation, and conversion from Cartesian to polar coordinates. To interpolate on a two- or three-dimensional grid we use a particular sequential combination of two or three 1D cubic interpolations, respectively, with Fourier-derived coefficients. A 1D cubic interpolation is a third degree polynomial:
$$Y(X) = A_0 + A_1 X + A_2 X^2 + A_3 X^3$$
where the polynomial coefficients $A_0$, $A_1$, $A_2$, and $A_3$ are calculated from the Fourier transform of the image:
$$
\begin{aligned}
A_0 &= Y(0) \\
A_1 &= Y'(0) \\
A_2 &= 3\,(Y(1) - Y(0)) - 2\,Y'(0) - Y'(1) \\
A_3 &= 2\,(Y(0) - Y(1)) + Y'(0) + Y'(1)
\end{aligned}
$$
The derivatives at grid nodes were obtained using the well-known relation between the Fourier transform of a derivative and the Fourier transform itself:
$$\mathcal{F}\!\left(\frac{\partial f(x,y)}{\partial x}\right) = i\,2\pi k\,F(k,l)$$
where $F(k,l)$ is a coefficient of the discrete Fourier transform series $\mathcal{F}(f(x,y))$.
This allows us to calculate derivatives at any local point without a finite difference approximation involving the data from neighboring points.
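The 1D building block described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the SPIDER code: the signal is treated as periodic with unit grid spacing, so the derivative factor is written as i*2*pi*k/N, the transform is a direct O(N^2) sum, and the function names are hypothetical.

```python
import cmath
import math


def node_derivatives(samples):
    """Derivative at every grid node, obtained by multiplying each DFT
    coefficient by i*2*pi*k/N and transforming back."""
    n = len(samples)
    # Forward DFT by direct summation (adequate for a small illustration).
    spectrum = [
        sum(samples[j] * cmath.exp(-2j * math.pi * k * j / n) for j in range(n))
        for k in range(n)
    ]
    derivs = []
    for j in range(n):
        total = 0j
        for k in range(n):
            freq = k if k <= (n - 1) // 2 else k - n  # signed frequency index
            if n % 2 == 0 and k == n // 2:
                freq = 0  # drop the Nyquist term for even N
            total += (2j * math.pi * freq / n) * spectrum[k] * cmath.exp(
                2j * math.pi * k * j / n
            )
        derivs.append((total / n).real)
    return derivs


def fbs_interpolate_1d(samples, derivs, x):
    """Evaluate the cubic A0 + A1*t + A2*t^2 + A3*t^3 on the grid interval containing x."""
    n = len(samples)
    base = math.floor(x)
    t = x - base  # position inside the interval, 0 <= t < 1
    i0, i1 = base % n, (base + 1) % n  # periodic wrap-around
    y0, y1 = samples[i0], samples[i1]
    d0, d1 = derivs[i0], derivs[i1]
    a0 = y0
    a1 = d0
    a2 = 3.0 * (y1 - y0) - 2.0 * d0 - d1
    a3 = 2.0 * (y0 - y1) + d0 + d1
    return a0 + a1 * t + a2 * t ** 2 + a3 * t ** 3


# One period of a sine wave sampled at 16 points, interpolated halfway
# between nodes 2 and 3.
n_points = 16
samples = [math.sin(2.0 * math.pi * j / n_points) for j in range(n_points)]
derivs = node_derivatives(samples)
value = fbs_interpolate_1d(samples, derivs, 2.5)
```

In 2D or 3D the same 1D step would be applied sequentially along each axis, as described above.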
We compared FBS to other commonly used interpolation techniques: quadratic interpolation and convolution reverse gridding (RG). Rotation of images by FBS interpolation takes roughly 1.1-1.5 times as long as quadratic interpolation but achieves dramatically better accuracy. The accuracy of FBS interpolation is similar to that of RG interpolation. However, FBS rotation is approximately 1.4-1.8 times faster than RG. The FBS algorithm combines the simplicity of polynomial interpolation with the ability to preserve high spatial frequencies. It has currently been incorporated into several operations in the open source package SPIDER for single-particle reconstruction.
9 Mar. 2011 ArDean Leith
Since hardware speeds are stagnant or decreasing, there is increased interest in optimizing SPIDER's processing speed. Since SPIDER is a general purpose EM imaging package, this means different things to different users. Locally the biggest time demand for our single particle reconstructions is alignment of images with reference projections (SPIDER operations: 'AP SH' and 'AP REF'). In order to assess the effect of changes in compiler options I used the operation 'AP SHC', which is the latest highly 'tweaked' version of 'AP SH'. The usual test data was a set of 375x375 pixel images, with a comparison of 50 experimental images versus 550 references.
30 Sep. 2010 ArDean Leith
Nvidia GPUs vary in their compute capability and in the amounts of three different types of memory, which have a critical influence on how a problem can be approached. In addition, alignment tasks usually take more than 5 minutes of GPU time, which means that the GPU cannot currently be shared with graphics. Thus there must be a dedicated GPU (often a Tesla/Fermi board).
Computer science publications and anecdotes commonly report speed-ups as the increase in speed of the parallelized portion of the application over its speed on a single processor. In usual reconstructions (e.g. realistic ribosome reconstructions) significant time is required to read images from disk. Such input typically occupies 3-10% of the time during an alignment. If only 4% of the time is spent loading, the largest possible overall speed-up is 25X. 100X is impossible overall. Another trick is to report speed-ups from a cluster of GPU enabled compute nodes, sometimes with multiple GPUs per processor.
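The 25X ceiling quoted above follows from the standard Amdahl's law relation. A minimal sketch, with hypothetical function names, assuming the image loading time is not accelerated at all:

```python
def overall_speedup(unaccelerated_fraction, accelerated_speedup):
    """Amdahl's law: overall speed-up when only part of the run time is accelerated."""
    if not 0.0 <= unaccelerated_fraction <= 1.0:
        raise ValueError("fraction must lie between 0 and 1")
    if accelerated_speedup <= 0.0:
        raise ValueError("speed-up must be positive")
    accelerated_fraction = 1.0 - unaccelerated_fraction
    return 1.0 / (unaccelerated_fraction + accelerated_fraction / accelerated_speedup)


def speedup_ceiling(unaccelerated_fraction):
    """Limit of the overall speed-up as the accelerated part becomes infinitely fast."""
    if unaccelerated_fraction == 0.0:
        return float("inf")
    return 1.0 / unaccelerated_fraction


ceiling_4_percent = speedup_ceiling(0.04)        # image loading takes 4% of the run
with_100x_kernel = overall_speedup(0.04, 100.0)  # alignment itself made 100X faster
```

With 4% of the time spent loading, even an alignment kernel that runs 100X faster gives an overall gain that stays below the 25X ceiling.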
SPIDER and other single particle reconstruction software usually have highly optimized alignment operations, commonly using OpenMP or MPI. Alignment speed as tested on our dual-hexcore computer scales very well with an increased number of cores (11X). Few computers today have a single core, and a useful speed-up should be defined in comparison to a reasonable computer setup, not versus speed on a single core.
In EM single particle reconstruction from reference projections using programs such as SPIDER, there is a vast range of different practical applications. The number of experimental images (x), the number of reference images (y), and the size of the images (z) can vary over orders of magnitude, e.g. x = 200-10,000 experimental images, y = 80-5000 reference images, z = 50x50 - 480x480 pixels.
The gold standard for alignment is still exhaustive search within a translation/rotation space, and the alignment is usually implemented with Fourier space cross-correlation of polar images. The common algorithm can be parallelized in an excess of different ways. A naive implementation on a GPU seldom results in more than a 2X speed-up. Only by tediously tuning the transfer of data within the GPU among the different memories can a speed-up of 12-20X be achieved. However a small change in the x, y, z variables mentioned above, or a change of compute capability in the GPU, can completely negate the speed-up, resulting in even poorer performance than without a GPU. Such a change requires a new implementation.
It is probably possible to create implementations that will give 12-20X speed-ups for any specific set of x, y, z and hardware. However a general implementation giving such a speed-up is currently impossible. Multiple (10-20?) implementations will be needed for each hardware configuration, and the logic to select the implementation is complex. Each implementation requires substantial programming effort.
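For readers unfamiliar with the rotational part of the search mentioned above, here is a minimal Python sketch of Fourier space cross-correlation on a single polar ring. It is only an illustration under simplifying assumptions (one ring, rotation only, a direct O(N^2) transform, hypothetical function names) and is not the code used in SPIDER or in any GPU implementation.

```python
import cmath
import math


def dft(values, inverse=False):
    """Direct discrete Fourier transform (slow but dependency-free)."""
    n = len(values)
    sign = 1.0 if inverse else -1.0
    out = [
        sum(v * cmath.exp(sign * 2j * math.pi * k * j / n) for j, v in enumerate(values))
        for k in range(n)
    ]
    return [c / n for c in out] if inverse else out


def best_rotation(reference_ring, image_ring):
    """Find the circular shift s that maximizes sum_j ref[j] * img[(j + s) % n].

    All n shifts are scored at once: the circular cross-correlation is the
    inverse transform of conj(F(ref)) * F(img).
    """
    ref_spec = dft(reference_ring)
    img_spec = dft(image_ring)
    product = [r.conjugate() * m for r, m in zip(ref_spec, img_spec)]
    correlation = [c.real for c in dft(product, inverse=True)]
    shift = max(range(len(correlation)), key=correlation.__getitem__)
    return shift, 360.0 * shift / len(correlation)


# A ring of 16 samples and a copy of it rotated by 3 samples.
reference_ring = [0, 1, 4, 2, 0, 0, 3, 1, 0, 0, 0, 5, 0, 0, 0, 0]
image_ring = [reference_ring[(j - 3) % 16] for j in range(16)]
shift, angle_deg = best_rotation(reference_ring, image_ring)
```

A real alignment repeats this over many rings, many reference images, and a grid of trial translations, which is where the many alternative ways of dividing the work arise.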
Currently reported alignment implementations admit that there have been unreported changes (degradations) in search algorithms or severe restrictions on various parameters. One report gives a rotational alignment resolution of only 6 degrees. Such a restriction makes the implementation useless on images greater than 100 pixels.
We can provide a single implementation in SPIDER that gives a 16X speed-up for a specific small range of parameters. However the overhead required to do so, including instructions on how to interact with 9 different run-time libraries for FFT, BLAS, and NVIDIA, makes even this minimally useful implementation painful. When compared to a run on a dual-hexcore computer, which already gives about 11X over a single core, this is really only an effective speed-up of about 1.5X!
Currently my advice is to carefully evaluate multi-core computers versus GPU enabled computers. Only if you have an extremely heavy compute load involving a single set of x, y, z parameters would it be worthwhile to go to a GPU solution. Then you will need software that is capable of handling your specific problem parameters. Otherwise split the problem among standard multi-core compute nodes. It probably will not be much more expensive to do so. If you still need increased speed, invest in a parallel filesystem for enhanced disk access (e.g. a Panasas disk array).
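One way to see why a 6 degree step is too coarse is to convert the angular step into a displacement at the edge of the image. This is a rough sketch based on simple geometry, not a calculation taken from the report, and the function name is hypothetical.

```python
import math


def edge_displacement(image_size_px, angular_step_deg):
    """Arc length in pixels swept at the image edge by one angular step."""
    radius = image_size_px / 2.0
    return radius * math.radians(angular_step_deg)


step_at_100_px = edge_displacement(100, 6.0)
step_at_375_px = edge_displacement(375, 6.0)
```

For a 100 pixel image a single 6 degree step already moves the outermost pixels by several pixels, and the displacement grows in proportion to the image size.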
This recommendation may change in the future and I will revisit this subject when I get access to the new Tesla GPU and the newly announced CUDA 4.0.
6 Mar. 2009 ArDean Leith
While getting ready to retire a bunch of old SGI MIPS based servers and workstations, I wondered how much faster our current AMD Opteron 64 bit Linux boxes are than our trusted old machines of 5-10 years ago. Benchmark table.
11 Feb. 2009 ArDean Leith
If you are using a Beowulf type cluster for parallel execution of time consuming operations during single particle reconstruction, there are three common methods of parallelizing discussed on our website. Since the iterative alignment and defocus group backprojection steps typically consume more than 98% of the compute time and are trivially parallelizable by defocus group, we commonly use a simple PubSub script for distributing jobs to different compute nodes. Other sites have their own scripts to handle the distribution. However if you have an inexpensive cluster with simple Ethernet networking, this method has a large inefficiency when there are many nodes accessing a single storage disk or simple RAID array on a file server using NFS mounts from the compute nodes.
When many compute nodes attempt to access a single disk (or RAID array) using NFS there is a significant slowdown in overall throughput. There is a lot of effort currently to overcome this problem with various methods, e.g. Parallel NFS. However if your compute nodes include adequate local storage on all the nodes, there is a simple solution that may improve throughput. At the beginning of a compute node's computation, copy all the files that are accessed to the local disk with a system call, then carry out the computations. At the end of the compute node's processing, copy any altered files back to the file server.
We have recently altered the scripts that we use during the projection matching step of 3D reconstruction so that pub_refine.pam and its associated procedures (especially pub_refine_start.pam) handle the cloning of the necessary files on local compute nodes and the transmission back to the server at the end of the processing on the compute nodes.
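The copy-in, compute, copy-back pattern described above can be sketched as follows. This is a Python stand-in for illustration only, not our SPIDER procedures: the function names are hypothetical, only the top level of the server directory is copied, and a file is treated as altered if its modification time or size changed.

```python
import shutil
import tempfile
from pathlib import Path


def snapshot(directory):
    """Record (modification time, size) for every file in a directory."""
    result = {}
    for path in directory.iterdir():
        if path.is_file():
            info = path.stat()
            result[path.name] = (info.st_mtime_ns, info.st_size)
    return result


def run_on_local_disk(server_dir, local_root, compute):
    """Copy inputs to node-local disk, run `compute` there, copy altered files back."""
    server_dir = Path(server_dir)
    work_dir = Path(tempfile.mkdtemp(dir=local_root))
    try:
        # Stage in: one bulk copy instead of many small NFS reads during the run.
        for src in server_dir.iterdir():
            if src.is_file():
                shutil.copy2(src, work_dir / src.name)
        before = snapshot(work_dir)

        compute(work_dir)  # all heavy file access now hits the local disk

        # Stage out: only new or altered files travel back to the file server.
        after = snapshot(work_dir)
        changed = sorted(name for name, sig in after.items() if before.get(name) != sig)
        for name in changed:
            shutil.copy2(work_dir / name, server_dir / name)
        return changed
    finally:
        shutil.rmtree(work_dir, ignore_errors=True)
```

Here `compute` is any callable that takes the local working directory, for example a function that launches the alignment job for one defocus group.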
On our compute cluster this modification is very productive. The speed increase will of course depend on the number of simultaneous processes, and the pattern of disk access.
Source: random.html Page updated: 1 Aug. 2014 ArDean Leith