Q: Who can use H++?
A: Unlimited free access to this site is restricted to non-profit use (including academic and educational use). If you intend to use it for any other purpose, please contact the development team to obtain a proper license. (Regardless of the purpose, you can test-run the site for up to three weeks, but please avoid submitting very large structures.)
Q: Do I need to log in?
A: No, it is not necessary to log in; you can process
(small) structures anonymously.
Q: Why would I want to register?
A: Registering allows you to keep your results for more than a single session. This will also allow you to have access to more advanced features, such as being able to process large structures.
Q: How do I register?
A: Click the "Register" button on the left-hand side of the screen, to the right of the "Login" button, or go to the Registration Page.
Q: Do I need to keep the browser open once the calculation has started?
A: If you are a registered user, you can log out and log back in
later to check your results (or to see the error/warning messages).
If you use H++ anonymously, you cannot access your results or error messages
once you log off.
Q: How can I obtain the source code for H++?
A: Please contact the development team. Note, however, that
installation of the suite is fairly non-trivial; several additional codes will have to be installed as well. We do not recommend attempting it without good system administration skills. Testing is a long process and is not automated: if the code "runs" on a few structures, it does not mean it "works". We have not tested H++ on any platform other than the one it currently runs on.
Also note that our team has no resources to provide any support beyond the web version that we are maintaining.
Q: What is the methodology behind H++?
A: The approach is based on classical continuum electrostatics and
basic statistical mechanics. As such,
it contains several approximations
to reality. The advantage of the approach is its rigorous basis (within the above framework): it does not contain heuristic fits to massive data sets or empirical approximations. One should keep in mind, however, that this by no means translates into perfect agreement with experiment when applied to real biomolecules. See a brief discussion below about the accuracy.
Q: What input parameters should I choose?
A: For typical physiological conditions, using the default value of 80 for the external dielectric (water)
and salinity = 0.15 M is reasonable. The situation is less straightforward with the internal dielectric. If you are mostly
interested in deeply buried residues, a lower value is recommended such as
4. If your focus is residues closer to the surface, a higher value is appropriate, such as 10 or even 20. See a few suggestions (two Qs down) on how these ideas can be used to improve the accuracy of your calculations.
Q: How fast is the current version of H++ server?
A: The actual processing time will vary depending on the load
on the H++ server. The following examples provide a rough estimate of the completion
time of the current version: approximately 18 seconds for a molecule with 12 titratable
sites (such as 1VII), 5 minutes for a molecule with 111
titratable sites (1AD2), and 45 minutes for a large molecule with 360 titratable sites.
Q: Is pK_(1/2) reported by H++ the same as pKa?
A: Generally yes, but not always.
By definition, pK_(1/2) is the mid-point of a titration curve. In the majority of cases the
latter is well approximated by the classical sigmoidal
(Henderson-Hasselbalch) shape, in which case pK_(1/2) = pKa.
If, however, the titration curve deviates strongly from the classical
Henderson-Hasselbalch sigmoidal shape, pKa is no longer a good approximation for pK_(1/2).
More on this problem can be found in this publication.
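The mid-point definition can be illustrated with a minimal sketch. It assumes the titration curve is available as a list of pH values and protonated fractions; the function names are illustrative and are not part of H++. For an ideal Henderson-Hasselbalch curve, the interpolated mid-point recovers the pKa.

```python
def henderson_hasselbalch(ph, pka):
    """Protonated fraction of an ideal, non-interacting group at a given pH."""
    return 1.0 / (1.0 + 10.0 ** (ph - pka))


def pk_half(ph_values, protonation):
    """pH at which the protonated fraction crosses 0.5 (linear interpolation).

    ph_values must be increasing; protonation holds the fraction protonated
    at each pH. Returns None if the curve never crosses 0.5.
    """
    points = list(zip(ph_values, protonation))
    for (ph0, f0), (ph1, f1) in zip(points, points[1:]):
        if (f0 - 0.5) * (f1 - 0.5) <= 0 and f0 != f1:
            return ph0 + (0.5 - f0) * (ph1 - ph0) / (f1 - f0)
    return None


# Hypothetical example: an ideal group with pKa = 4.0 sampled every 0.5 pH units.
ph_grid = [i * 0.5 for i in range(29)]
curve = [henderson_hasselbalch(ph, 4.0) for ph in ph_grid]
print(pk_half(ph_grid, curve))
```

For a strongly non-sigmoidal curve, the same function still returns the first mid-point crossing, but that number no longer characterizes the whole curve.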
Q: What is pK_int?
A: This is a hypothetical pK of a group, assuming that it does not interact with any other titratable
group in the protein. The concept is sometimes useful for analysis of the calculation, but in most cases
you don't have to worry about it. For more details, refer to
D. Bashford and M. Karplus, "pKa of Ionizable Groups in Proteins: Atomic
Detail from a Continuum Electrostatic Model,"
Biochemistry, 29, 10219-10225.
Q: What is the accuracy (compared to experiment) of the pK values
computed by this server?
A: The answer may not be as straightforward as one may want it to be.
Generally, the single-structure continuum solvent methodology used here is believed to have an
error margin of the order of 1 pK unit, on average.
However, the deviation may be
larger in some cases (in particular, for CYS groups). In most cases you can be reasonably sure that the
overall trend (whether the pK of a given site is shifted up or down relative to the standard value in solution) is predicted correctly. As with all computational methods, computed differences are more accurate than the absolute values:
predicting an effect of a point mutation on the pK of a nearby site may be more accurate than calculating the absolute value of that pK. Likewise, quantitative predictions of the
changes in a pK due to a well-defined local conformational change should
be well within reach of the methodology. As with any pK-predicting method,
lower-resolution X-ray
structures tend to provide worse accuracy; in particular,
older structures with resolution
worse than 2.5 Å may be especially dangerous in that respect. If available,
it is always
a good idea to run a few different structures corresponding to the same
biological molecule and compare the results:
a consensus (e.g. geometric mean) pK
value is a better approximation to reality. One should also try and go
through the log messages generated by H++, in particular "leap.log".
If heavy atoms were missing in the original X-ray structure and were added
by H++, one should be particularly careful in trusting the computed pKs
of the residues in the immediate vicinity of the added atoms.
We strongly recommend consulting the relevant literature before
using the server in your
work. Some references are available in the Methodology description on the home page.
Here is a comparison of H++ generated pKs with the corresponding
experimental values for
a set of high-quality protein structures.
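The consensus idea mentioned above can be sketched as follows. The pK values are hypothetical placeholders for results obtained from different structures of the same molecule; the geometric mean follows the suggestion in the text, and the spread is reported as a rough indication of how much the structures disagree.

```python
from statistics import geometric_mean


def consensus_pk(values):
    """Geometric mean of positive pK values computed from several structures."""
    return geometric_mean(values)


def spread(values):
    """Difference between the largest and smallest pK, as a crude error hint."""
    return max(values) - min(values)


# Hypothetical pK values of one site computed from three PDB structures.
pks = [3.9, 4.4, 4.1]
print(round(consensus_pk(pks), 2), round(spread(pks), 2))
```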
Q: Ok, I am not satisfied with the accuracy of my calculations.
Can I do something within the H++ environment to improve it?
A: Yes, but this will require additional work.
If your focus is only a few
groups, you can set the internal dielectric in accordance with where the groups
are in the molecule (see above and the following link
to see how
the internal dielectric affects the results).
Do a full calculation first, then
look up the "your_molecule.summ" file in the "Listing" on the results page.
This will show you the various contributions to the pK of your focus group.
In particular, the "delta self" term is the desolvation penalty, indicative
of how buried the group is. If its absolute value is relatively large
(say, > 2 pK units), it means that a lower internal dielectric value (4) may be more appropriate for
this group. On the other hand, a small "delta self" is telling you
that the group is close to the surface, in which case you may benefit from
using a higher dielectric of 20. You can repeat this procedure
separately for each group of interest: limiting the overall number of titratable groups (see below) will speed up the calculations.
See also Demchuk, E. and R. C. Wade, "Improving the continuum dielectric approach to calculating pKa's of ionizable groups in proteins", J. Phys. Chem., 100, 17373, (1996).
It is also a good idea to use the best (highest resolution) X-ray structure
for pK calculations. If several good structures are available, select the 2-3
most accurate ones and average the results.
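The "delta self" rule of thumb described above can be written down as a small helper. This is only a sketch of the heuristic in the text: the function name and the way the value is read from the ".summ" file are assumptions, and the 2 pK unit threshold is the approximate figure quoted above rather than a hard limit.

```python
def suggest_internal_dielectric(delta_self, threshold=2.0):
    """Suggest an internal dielectric from the desolvation penalty ("delta self").

    A large absolute value (in pK units) indicates a buried group, for which
    the text suggests a low dielectric (4); otherwise a higher value (20).
    """
    return 4 if abs(delta_self) > threshold else 20


# Hypothetical "delta self" values copied by hand from a .summ file.
for group, value in {"ASP 26": 3.1, "GLU 35": -2.6, "LYS 96": 0.4}.items():
    print(group, suggest_internal_dielectric(value))
```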
Q: Will H++ protonate or deprotonate the C-terminus carboxyl or N-terminus amine groups?
A: No. H++ does calculate the pK_(1/2) values for these groups.
However, the protonation states of these groups are not changed in the output PDB file,
even if they should be, based on the input pH value.
This is because, currently, standard Amber force field parameters are only available
for these groups in their default protonation states.
Q: How can I cap protein termini?
A: Protein termini can be capped using the following procedure:
1. Process your uncapped structure through H++.
2. Download the output PDB file generated by H++.
3. To cap the C-terminus:
(a) change the C-terminus OXT atom to N,
(b) change its residue name to NME (N-methylamide) or NHE (amide), and
(c) increase its residue number by 1.
4. To cap the N-terminus:
(a) delete the N-terminus H1 and H3 atoms,
(b) change the N-terminus H2 atom to C,
(c) change its residue name to ACE,
(d) decrease its residue number by 1, and
(e) move the N-terminus C atom to the front of the chain.
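Steps 3 and 4 above can be sketched as a small script. It assumes a single chain given as a list of fixed-column PDB "ATOM" lines in which the first and last lines belong to the N- and C-terminal residues; the function names are illustrative, NME is chosen for the C-terminal cap, and the result should be inspected before further use.

```python
def retag(line, atom, resname, resnum, element):
    """Rewrite atom name, residue name and residue number in fixed PDB columns."""
    line = line.rstrip("\n").ljust(80)
    line = (line[:12] + f" {atom:<3}" + line[16:17] + f"{resname:>3}"
            + line[20:22] + f"{resnum:>4}" + line[26:])
    if line[76:78].strip():  # update the element symbol only if the file has one
        line = line[:76] + f"{element:>2}" + line[78:]
    return line.rstrip()


def cap_termini(atom_lines):
    """Apply the capping recipe to the ATOM lines of one chain."""
    first = int(atom_lines[0][22:26])
    last = int(atom_lines[-1][22:26])
    ace, nme, kept = None, None, []
    for line in atom_lines:
        name = line[12:16].strip()
        resnum = int(line[22:26])
        if resnum == first and name in ("H1", "H3"):
            continue  # step 4(a): delete H1 and H3
        if resnum == first and name == "H2":
            # steps 4(b)-(d): H2 becomes atom C of ACE, residue number - 1
            ace = retag(line, "C", "ACE", first - 1, "C")
            continue
        if resnum == last and name == "OXT":
            # step 3: OXT becomes atom N of NME, residue number + 1
            nme = retag(line, "N", "NME", last + 1, "N")
            continue
        kept.append(line.rstrip("\n"))
    # step 4(e): the ACE atom goes to the front of the chain
    return ([ace] if ace else []) + kept + ([nme] if nme else [])
```

Non-ATOM records such as "TER" and "END" are deliberately left out of this sketch and would need to be re-attached after the capped chain.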
Q: Can H++ process phosphorylated side chains?
A: Yes, the following phosphorylated amino acids can be processed by H++:
PTR: phosphotyrosine with a 2- net charge (using PDB nomenclature)
SEP: phosphoserine with a 2- net charge (using PDB nomenclature)
TPO: phosphothreonine with a 2- net charge (using PDB nomenclature)
Y1P: phosphotyrosine with a 1- net charge (nonstandard nomenclature)
S1P: phosphoserine with a 1- net charge (nonstandard nomenclature)
T1P: phosphothreonine with a 1- net charge (nonstandard nomenclature)
See N. Homeyer, A.H.C. Horn, H. Lanig, and H. Sticht,
"AMBER force-field parameters for phosphorylated amino acids in
different protonation states: phosphoserine, phosphothreonine,
phosphotyrosine, and phosphohistidine,"
J. Mol. Model. 12:281-289, 2006, for the force field parameters.
Q: What is this mysterious "protonation state diagram"?
A: It represents the lowest protonation microstates in your system, along with their relative energies. This is a more fundamental description of a titratable system than that based on pKs, and may be useful in many cases. For the definition and a usage example, see this publication. The cartoon on the H++ results page shows only a few of the lowest states for select residues (difference between the lowest and the next protonation microstate < 3 kcal/mol); a link to the full list of the lowest 128 microstates is available on the results page. The line at the top of the list
shows titratable residues in the order corresponding to the occupancies
given below: "1" - protonated, "0" - deprotonated.
One caveat is that the titratable group index in this diagram always
starts from one; that is, it may be shifted relative to the input PDB file.
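Reading an occupancy string against the residue list can be sketched as below. The exact layout of the microstate list file is not reproduced here, so the input format (a list of residue labels plus a string of "0"/"1" characters) is an assumption made for illustration.

```python
def decode_microstate(residues, occupancies):
    """Map each titratable residue to its state in one protonation microstate.

    residues:    labels in the order given at the top of the list
    occupancies: string of "1" (protonated) and "0" (deprotonated) characters
    """
    if len(residues) != len(occupancies):
        raise ValueError("one occupancy digit is needed per residue")
    return {
        res: "protonated" if digit == "1" else "deprotonated"
        for res, digit in zip(residues, occupancies)
    }


# Hypothetical residue labels and occupancy string.
print(decode_microstate(["ASP-1", "HIS-2", "LYS-3"], "011"))
```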
Q: My structure has multiple chains. Which one will be used for the calculation?
A: Generally, all of them. Be careful, though. While most chains are
"legitimate" chains corresponding to subunits of the whole multimer (as in e.g. hemoglobin), some PDB files use chain identifiers "A", "B", etc.
to denote different
models of the same monomer (as in e.g. 2TRX).
Make sure you supply only the one you want. For NMR structures,
you get a window where you can choose which model to use.
Q: My structure has missing residues in the middle of the sequence.
Will H++ still compute pKs and protonate the structure?
A: No. H++ will read in your structure and report an error. pK estimates of any kind depend critically on fine details of your structure:
if chunks of the structure are missing, the
computed pKs may sometimes be completely off, especially in the vicinity of the missing part. So you have to be very careful with what you do next.
The safest approach is to find another PDB structure which does
not have missing residues. If one is not available, there are still
a couple of things you can try.
If the groups you are interested in are not
located in the vicinity of
the structural gaps (in real 3D space, not in sequence space!), you may
re-do the computation by treating discontiguous
parts of the structure
as separate chains. Simply insert a "TER" in each gap in the PDB file.
Make sure that residues in each new "chain"
are numbered sequentially, without gaps.
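The "TER" insertion described above can be sketched as follows, assuming a fixed-column PDB file with integer residue numbers and no insertion codes; the function name is illustrative. A gap is taken to be any jump of more than one in the residue number within the same chain identifier.

```python
def insert_ter_at_gaps(pdb_lines):
    """Insert a TER record wherever residue numbering jumps within a chain."""
    out = []
    prev = None  # (chain identifier, residue number) of the previous ATOM line
    for line in pdb_lines:
        line = line.rstrip("\n")
        if line.startswith("ATOM"):
            key = (line[21], int(line[22:26]))
            if prev is not None and key[0] == prev[0] and key[1] - prev[1] > 1:
                out.append("TER")
            prev = key
        elif line.startswith("TER"):
            prev = None  # an existing chain break resets the comparison
        out.append(line)
    return out
```

Each resulting segment keeps its original, internally sequential numbering; whether the groups of interest are far enough from the gaps still has to be judged in 3D, as noted above.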
Generally, the more residues are missing, and the closer is the group
of interest to the missing space, the less reliable the results
are. A somewhat safer, but much more laborious approach is
reconstructing the missing parts, e.g. by homology modelling. Still,
if the group of interest is in the immediate vicinity
(within a few Angstroms) of the gap, the only reliable way to compute
its pK is to use
an experimental structure without gaps.
Note, however, that if only a few heavy atoms in a residue are missing, H++ will add them automatically and proceed. You have to be careful
though -- if these atoms lie close to the region of interest, the
accuracy may be affected. Always check the corresponding log files.
Hydrogens are always added by H++ if they
are not available (most likely scenario) from the original structure, but that's OK. We suggest that you examine your structure carefully before
submitting. While H++ is designed to catch many problems and report
them to you, you cannot rely on it to "proofread" your input.
Q: How can I keep specific buried water molecules?
A: Water molecules in the input PDB file can be identified
by the residue name "HOH".
Edit the input PDB file and change "HETATM" to "ATOM"
for the water molecules you want to keep.
NOTE: If the PDB file only contains the O atom, and not
the H atoms, H++ will add the missing H atoms
followed by a crude optimization step.
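The record edit described above can be scripted. The sketch below assumes a fixed-column PDB file and selects waters by residue number only (chain identifiers are ignored); the function name and the residue numbers in the example are illustrative.

```python
def keep_waters(pdb_lines, keep_resnums):
    """Change HETATM to ATOM for HOH residues whose numbers are in keep_resnums."""
    keep = set(keep_resnums)
    out = []
    for line in pdb_lines:
        if (line.startswith("HETATM") and line[17:20] == "HOH"
                and int(line[22:26]) in keep):
            line = "ATOM  " + line[6:]  # same width, so columns stay aligned
        out.append(line)
    return out


# Hypothetical usage: keep waters 301 and 317, leave all others as HETATM.
# edited = keep_waters(open("input.pdb").read().splitlines(), [301, 317])
```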
Q: How can I keep an explicit ion in my computation?
A: Generally, all ions are automatically stripped off at the beginning of the H++ process. However, you can explicitly
tell H++ to keep specific ion(s) by using an "ATOM" record for your ion(s), as opposed to "HETATM", in your input PDB. Also,
you will have to use specific atom and residue names for the ions,
which can be found here.
Generally, ions that represent ligands can be kept;
however, we do not recommend keeping ions that are part of
the solvent, since these ions are already included in the
implicit solvent model used by H++.
Q: How can I calculate the pKs for membrane embedded proteins?
A: H++ uses the lipid17 AMBER force field parameters for lipids
(Skjevik et al. 2016).
To calculate the pK of membrane embedded proteins, first construct a PDB file containing the protein embedded in the membrane, see e.g. http://ambermd.org/tutorials/advanced/tutorial16/.
Ensure that the atom and residue names in the PDB file conform to the convention used for the lipid17 force field parameters
(Skjevik et al. 2016).
Then, process the pdb file through H++, with the "Correct orientation of ASN, GLN and HIS groups, add H atoms, and assign HIS H atoms to the or O, based on van der Waals contacts and H-bonding" option deselected.
Q: My structure contains a ligand, but it is getting stripped off. Is there any way to keep it for the calculation?
A: Yes, there are three different ways in which ligands can be included:
1. If the ligand is a protein, peptide, DNA or RNA, the current version of
H++ should in most cases handle it safely automatically -- you can submit the structure
in regular PDB format
(but make sure that the ligand records are "ATOM" and not "HETATM"). The
same procedure will work if you decide to keep a water molecule in; just make
sure its residue name is "HOH". (This may cause trouble in multi-chain proteins;
the safest option is to strip all water molecules for multi-chain proteins.)
Note, however, that solvation effects are
already accounted for (implicitly) by the H++ methodology, and so
keeping explicit water molecules in the
structure is generally not recommended for pK calculations.
Of course, there are exceptions to this rule: please consult relevant
literature for details.
2. For many other types of ligands, H++ can still handle them
automatically, but you need to be very careful. First, you will need to edit the PDB file
and change "HETATM" to "LIGAND" (columns 1-6) for each ligand atom.
At the moment only one ligand can be processed per run.
Click here for further details and to see an example.
For some complex ligand structures, such as those containing the heme group, the above method may not work,
in which case, consider the alternative described below.
3. Alternatively, you can "manually" add charges and atomic radii records to your PDB,
and then input your file in the PQR (PDB + charge + radius) format.
This way, it will bypass the "clean-up" routines and will go straight into the pK calculating part.
CAVEAT: the titratable groups in the input PQR file must be in their
standard protonation states (doubly protonated for "HIS", which must be renamed into "HIP") with correct
number of protons present; the atom and residue names must follow the
AMBER convention. Also, to be on the safe side, make sure that all records are only "ATOM" and/or "HETATM".
One way to get what you need is to first upload your PDB file without the ligand,
run H++ on it, and retrieve the "your_molecule.replaced.pqr" PQR file from the results page. This one has all the names and protonation states set right for the task. Then, all you have to do is add the missing ligand records to the file (and make sure its extension is ".pqr") and process it through H++ again.
Do not use the final PDB (PQR) file that H++ produces: the protonation state of this one has been set according to calculated pKs and
the specified pH, and does not necessarily correspond to the "standard" one.
Note that if you want your ligand treated as a
titratable group but it is not one of the
standard amino acids, life becomes more complicated; that is, the above simple solution won't work. To see how these types of problems are handled, see
e.g. this publication. Contact us if you are working on a real
research project and this problem becomes an insurmountable stumbling block.
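For option 2 above, the record-name change can be sketched as a short script. It assumes a fixed-column PDB file and a single ligand identified by its residue name; the function name and the residue name "BEN" in the example are hypothetical.

```python
def tag_ligand(pdb_lines, ligand_resname):
    """Replace HETATM with LIGAND (columns 1-6) for every atom of one ligand."""
    out = []
    for line in pdb_lines:
        if line.startswith("HETATM") and line[17:20].strip() == ligand_resname:
            line = "LIGAND" + line[6:]  # both tags are six characters wide
        out.append(line)
    return out


# Hypothetical usage for a ligand whose residue name is BEN:
# edited = tag_ligand(open("input.pdb").read().splitlines(), "BEN")
```

Because only one ligand can be processed per run, the helper takes a single residue name; other HETATM records are left unchanged.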
Q: Ok, but if I am to prepare the PQR file manually, where do I get the charges and radii for the ligand, required
by the PQR format?
A: This depends on what kind of ligand it is. For ions, or something
very simple like water, the best way is to take these from the
AMBER database.
We are also building a database of PQR files for more
complex ligands, see here.
For other ligands, you will most likely
need to compute charges from scratch, unless you have
access to pre-computed ones for your specific ligand.
H++ users have reported using the
PRODRG2 server for the purpose. Ideally, you would want to
perform a high quality QM calculation followed by a charge fitting
procedure such as RESP. As for the radii, these follow
a very simple pattern; see any PQR file generated by H++ (currently,
the Bondi set is used).
When constructing a PQR file,
always remember to separate different chain records by a "TER", and
put a "TER" at the end as well to let H++ know which residues
should be treated as terminal.
Q: Can I limit the set of titratable groups to be included in the calculation?
A: Yes. But doing it blindly may be risky, as
some important interactions may get left out. The advantage is, of course,
speed. And, if you focus on a particular small group of residues, you
can take advantage of choosing more appropriate values of the input parameters,
such as the internal dielectric. Proceed with caution.
In any case, we recommend that if your structure is small enough
you first do a "full" run to identify which
sites do not interact strongly with the sites you want to focus on --
H++ outputs a specific file that lists
residues that contribute most to each pK shift. As a result of such a run you will have a "your_molecule.replaced.pqr" file (in "View all files generated for this run: Listing") which will become
your next-step input structure in PQR
format. It has all ionizable groups in their standard protonation states. Now
suppose you DO NOT want residue "GLU 35" to be treated by H++ as titratable.
Change its name into "GLX 35", and it will be considered by H++ as non-ionizable when you upload the modified structure in the PQR format.
By default, the server treats only GLU, ASP, ARG, CYS, LYS, HIP (which is the AMBER name for
the doubly protonated HIS), and TYR as titratable, and any other name as
non-titratable in the PQR format. Do not forget that the input file
must have the .pqr extension to be processed in this manner by H++.
Limiting the number of groups treated as titratable may be critical for
large structures. You can cautiously assume that groups further than 15 Angstroms away from your group of interest do not matter much, and can be "made" non-titratable using the above trick.
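The renaming trick above can be sketched as a small helper. It assumes a whitespace-separated PQR file in which the residue name is the fourth field and the last five fields are x, y, z, charge and radius; the function name is illustrative, and "GLX" is simply the example name used in the text.

```python
import re


def make_non_titratable(pqr_lines, resname, resnum, new_name):
    """Rename one residue so that it is no longer in the titratable set."""
    # Record name, serial and atom name, followed by the residue name.
    pattern = re.compile(
        r"^((?:ATOM|HETATM)\s+\S+\s+\S+\s+)" + re.escape(resname) + r"(?=\s)"
    )
    out = []
    for line in pqr_lines:
        fields = line.split()
        # The residue number is the field just before x, y, z, charge, radius.
        if len(fields) >= 10 and fields[3] == resname and fields[-6] == str(resnum):
            line = pattern.sub(lambda m: m.group(1) + new_name, line, count=1)
        out.append(line)
    return out


# Usage corresponding to the example in the text: GLU 35 -> GLX 35.
# edited = make_non_titratable(lines, "GLU", 35, "GLX")
```

Using a replacement name of the same length keeps the columns aligned. Remember to save the result with the .pqr extension.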
Q: What parameters (force-field, etc) are used to set up AMBER topology and coordinate files?
A: The AMBER ff19SB force field for proteins (Tian et al. 2020),
ff99bsc0+bsc1 for DNA (Ivani et al. 2016),
ff99bsc0_caseP_Shaw for RNA (Tan et al. 2018),
lipid17 for lipids (Skjevik et al. 2016),
and the mbondi2 radii set.
Two options are available for generating Amber format
topology and coordinate files with an explicit water box:
the classic TIP3P (Jorgensen et al. 1983)
3-point water model, or the more accurate OPC
(Izadi et al. 2014) 4-point water model.
Åqvist ion parameters are used for common monovalent ions
(Li+, Na+, K+, Rb+, Cs+, Cl-, F-, I-, Br-)
as optimized for AMBER by
Joung and Cheatham 2008,
and the ion parameters by Li et al. 2013
for other monovalent and multivalent ions.
All disulfide (CYX-CYX) bonds found in the structure are set.
Also note that in the current implementation,
the terminal residues are always
left in their standard protonation states, even though their pKs
may indicate otherwise.
Q: Can I calculate the matrix of electrostatic site-site interactions?
A: Yes. These are calculated and are in "your_molecule.g".
This file, along with
many other useful auxiliary files, is available from the last page, in
"View all files generated for this run: Listing". Note, however,
that the groups in some of these files may be
numbered sequentially starting from residue 1; that is, there may be a constant offset of indices relative to your input structure.
Q: Can I obtain the breakdown of the electrostatic contributions
to pK into the "Born" (desolvation penalty)
and "Background" terms?
A: Yes. These are in the "your_molecule.summ" file (see above).
A very detailed decomposition of the energetic contributions
to pK from every residue is available from
Q: What is the "flip" option on the parameter selection screen?
A: The N and O atoms in the amide groups of ASN and GLN,
and the N and C atoms in the imidazole ring of HIS,
cannot be easily distinguished from electron density maps.
Thus the orientations of these atoms are frequently
incorrect in PDB structures.
The reduce software from the Richardson Laboratory
at Duke University is used to identify the preferred orientation for
these atoms based on van der Waals contacts and H-bonding
(Word et al. 1999).
reduce also adds missing H atoms
and standardizes the bond lengths and bond angles of existing H atoms
in the input PDB file.
If the flip option is selected on the parameter selection screen,
the orientation of the amide N and O atoms in ASN and GLN,
and the imidazole N and C atoms in HIS in the PDB file
may be flipped as determined by reduce.
If the flip option is selected,
H++ also uses the added and standardized H atom placement for
Q: Which tautomer does H++ use when adding H atoms to HIS?
A: If the flip option (described above)
is selected on the parameter selection screen,
then the HIS tautomer, delta or epsilon, is determined by
reduce based on van der Waals contacts and H-bonding.
In the case where reduce determines that the HIS
is doubly protonated (HIP), H++ assumes that the
singly protonated state is the epsilon tautomer (HIE)
for the purpose of pK calculations.
If the flip option is not selected,
then H++ assumes the epsilon tautomer (HIE)
unless specifically identified as the delta tautomer (HID)
in the incoming PDB file.
Q: Which tautomer does H++ use when adding hydrogens to GLU and ASP?
A: For GLU and ASP the hydrogen is added to the OE2 carbonyl oxygen atom,
Q: How can I visualize the effect of protonation state changes?
A: Click the image below for a free, open-source utility (GEM) that allows coloring the surface
of the H++ generated structures with the electrostatic potential.
Compare structures generated at different pH values.
Åqvist, J. (1990).
Ion-water interaction potentials derived from free energy perturbation simulations.
J. Phys. Chem.,
Vol. 94, No. 21, pp. 8021-8024.
Banas, P., et al. (2010).
Performance of molecular mechanics force fields for RNA simulations:
Stability of UCG and GNRA hairpins.
J. Chem. Theory Comput.
vol. 6, pp. 3836-3849.
Hornak, V., Abel, R., Okur, A., Strockbine, B.,
Roitberg, A., and Simmerling, C. (2006).
Comparison of multiple Amber force fields and development of improved protein backbone parameters.
Proteins: Structure, Function, and Bioinformatics,
Vol. 65, No. 3, pp. 712-725.
Izadi, S., Anandakrishnan, R., and Onufriev, A.V. (2014).
Building Water Models: A Different Approach.
J. Phys. Chem. Lett.,
Vol. 5, No. 21, pp. 3863-3871.
I. Ivani; P. D. Dans; A. Noy; A. Pérez; I. Faustino; A. Hospital; J. Walther; P. Andrió; R. Goni; A. Balaceanu;
G. Portella; F. Battistini; J. L. Gelpí; C. González; M. Vendruscolo; C. A. Laughton; S. Harris; D. A. Case; M. Orozco.
Parmbsc1: A refined force field for DNA simulations.
Nature Meth., 2016, 13, 55–58.
Jorgensen et al. (1983).
Comparison of simple potential functions for simulating liquid water.
J. Chem. Phys.
vol. 79, pp. 926-935.
Joung, S., and Cheatham, T.E. (2008).
Determination of alkali and halide monovalent ion parameters for use in
explicitly solvated biomolecular simulations.
J. Phys. Chem.,
Vol. 112, pp. 9020-9041.
Krepl, M., et al. (2012).
Reference simulations of noncanonical nucleic acids with different
variants of the AMBER force field:
Quadruplex DNA, Quadruplex RNA, and Z-DNA.
J. Chem. Theory Comput.
vol. 8, pp. 2506-2520.
Li, B., Roberts, B.P., Chakravorty, D.K., and Merz, K.M. (2013).
Rational design of particle mesh Ewald compatible Lennard-Jones
parameters for +2 metal cations in explicit solvent.
J. Chem. Theory Comput.
vol. 9, pp. 2733-2748.
Perez, A., Marchan, I., Svozil, D., Sponer, J.,
Cheatham, T.E., Laughton, C.A., and Orozco, M. (2007).
Refinement of the AMBER Force Field for Nucleic Acids: Improving the Description of alpha/gamma Conformers.
Vol. 92, No. 11, pp. 3817-3829.
Å. Skjevik; B. D. Madej; C. J. Dickson; C. Lin; K. Teigen; R. C. Walker; I. R. Gould.
Simulations of lipid bilayer self-assembly using all-atom lipid force fields.
Phys. Chem. Chem. Phys., 2016, 18, 10573–10584.
D. Tan; S. Piana; R. Dirks; D. Shaw.
RNA force field with accuracy comparable to state-of-the-art protein force fields.
Proc. Natl. Acad. Sci. USA, 2018, 115, E1346–E1355.
C. Tian; K. Kasavajhala; K. Belfon; L. Raguette; H. Huang; A. Migues; J. Bickel; Y. Wang; J. Pincay; Q. Wu; C. Simmerling.
ff19SB: Amino-Acid-Specific Protein Backbone Parameters Trained against Quantum Mechanics Energy Surfaces in Solution.
J. Chem. Theory Comput., 2020, 16, 528–552.
Wang, J., and Kollman P.A. (2001).
Automatic parameterization of force field by systematic search and genetic algorithms.
Journal of Computational Chemistry,
Vol. 22, No. 12, pp. 1219-1228.
Word, M.J., Lovell, S.C., Richardson, J.S., and Richardson, D.C. (1999).
Asparagine and Glutamine: Using Hydrogen Atom Contacts in the Choice of
Side-Chain Amide Orientation.
Journal of Molecular Biology,
Vol. 285, pp. 1735-1747.
Zgarbova, M., et al. (2013).
Toward improved description of DNA backbone:
Revisiting epsilon and zeta torsion force field parameters.
J. Chem. Theory Comput.
vol. 9, pp. 2339-2354.