Hartree-Fock Theory Note
- Problem statement
- The simplest case (both Vee and antisymmetry principle ignored)
- Hartree approximation (Vee considered, antisymmetry principle ignored)
- Hartree-Fock approximation
- Matrix form and charge density matrix
- Thomas-Fermi theory
- Density Functional Theory (DFT)
- Appendix
- References
Problem statement
We want to solve the following time-independent Schrodinger equation:
[−12∑i∇2i⏟Te(r)+∑A,i−ZArAi⏟VeN(r,R)+∑i<j1rij⏟Vee(r)]Ψ(r1,⋯,rN)=EΨ(r1,⋯,rN)in which ∇2i=∂2∂r2i,x+∂2∂r2i,y+∂2∂r2i,z is the laplace operator for ri=(ri,x,ri,y,ri,z), N is the number of electrons. ri,rj and RA,RB are the 3d coodinate of electrons i,j (电子) and nuciels A,B (原子核) respectively. riA=|ri−RA| is the distance between electron i and nuciel A, rij=|ri−rj| is the distance between electron i and j. ZA is the atomic number (原子序数) of nucleus A.
- Te(r): kinetic energy of electrons
- VeN(r,R): electrostatic potential (静电势) between electrons (e) and nuciels (N) (or “external potentials”)
- Vee(r): electrostatic repulsion (静电排斥) between electrons
The solution of the equation is a “wave function” Ψ(r1,⋯,rN) so that the equation always holds for any r1,⋯,rN. Additionally, |Ψ(r1,⋯,rN)|2 is the probability that electron 1 appear at coodinate r1 and electron 2 appears at coodinate r2 and so on.
Here we assume that the nuciels are fixed so all R are constants (the Born-Oppenheimer approximation). Therefore we do not consider TN(R)=∑A12MA∇2A (kinetic energy of nucleis) and VNN(R)=∑A<BZAZBRAB (Coulombic repulsion between nucleis) which are also constants.
Let h(i)=−12∇2i−∑AZArAi, then the equation can be also written as
[∑ih(i)+∑i<j1rij]Ψ(r1,⋯,rN)=EΨ(r1,⋯,rN)There are two challenges to solve this equation:
- Vee(r)=∑i<j1rij is hard to tackle, since it makes all electrons correlated. Without it, we can simply write the equation as ∑ih(i)Ψ(r1,⋯,rN)=EΨ(r1,⋯,rN). In this case, we only need to solve N single-electron equations h(i)ψi(ri)=ϵiψi(ri) for i=1,⋯,N, get N wave functions ψ1(r1),⋯,ψN(rN) (molecular spatial orbitals (空间轨道). Orbital is a wave function for a single electron, see section 2.2 of [1]), then let Ψ(r1,⋯,rN)=∏iψi(ri) (hartree product), E=∑iϵi, and the equation is solved.
- The wave function Ψ(r1,⋯,rN) needs to satisfy a specific physical property named antisymmetry principle, whose famous representation is Pauli exclusion principle (泡利不相容原理) that tell us two electrons in the same spatial orbital must have opposite spins. To illustrate this, we need to take spin ω={α,β} (自旋) into account. Let space-spin coordinates x=(r,ω) and change the molecular spatial orbital ψ(r) to molecular spin orbital (自旋轨道) χ(x)=ψ(r)ω. Then the antisymmetry principle requires that Ψ(⋯,xi,⋯,xj,⋯)=−Ψ(⋯,xj,⋯,xi,⋯). e.g., for a two-electron system, Ψ(x1,x2)=−Ψ(x2,x1), and when x1=x2 we have Ψ(x1,x2)=−Ψ(x1,x2)⇒Ψ(x1,x2)=0⇒|Ψ(x1,x2)|2=0, which means the probability of two electrons in the same space with parallel spin is zero. (By the way, this is why the electrons are called as fermions (费米子). The wave function will change its sign when two fermions swap their coordinates)
The Hartree-Fock theory tackles these challenges by the following ways:
- (Hartree Approximation) For the first challenge, assume Ψ(r1,⋯,rN)=∏Ni=1ψi(ri) (hartree product, note that we may have K>N molecular spatial orbitals ψ1,⋯,ψK but we select the first N orbitals with the lowest energy. It is valid since ∫ψi(ri)dri=1 and ∫∏Ni=1ψi(ri)dr1⋯drN=1. This is an approximated wave function), then we minimize the total energy of (1) using variational method, and find that: with the assumed wave function, for electron i, we can approximate its interaction with other electrons by the average field of other electrons, and we add all the average field to approximate Vee. i.e., let Vee=∑ig(i). In this case we can let h′(i)=h(i)+g(i) so we can write the equation as ∑ih′(i)Ψ(r1,⋯,rN)=EΨ(r1,⋯,rN) and solve it easily as single-electron equations h′(i)ψi(ri)=ϵiψi(ri) (not so easy though, since g(i) includes the wave function of other electrons, and a self-consistent iteration is needed).
- (Hartree-Fock Approximation) For the second challenge, assume Ψ(x1,⋯,xN)=1√N!|χ1(x1)χ2(x1)⋯χN(x1)χ1(x2)χ2(x2)⋯χN(x2)⋮⋮⋱⋮χ1(xN)χ2(xN)⋯χN(xN)| (slater determinant), which is a better approximation of the wave function that satisfy antisymmetry principle. E.g., for the two-electron case, Ψ(x1,x2)=1√2|χ1(x1)χ2(x1)χ1(x2)χ2(x2)|=1√2[χ1(x1)χ2(x2)−χ2(x1)χ1(x2)]. It is easy to check that Ψ(x1,x2)=−Ψ(x2,x1), and if the two electrons are described by the same spin orbital χ=χ1=χ2, then Ψ(x1,x2)=0.
The simplest case (both Vee and antisymmetry principle ignored)
As mentioned before, if both Vee and antisymmetry principle are ignored, then we can just solve the following one-electron differential equation
h(i)ψi(ri)=ϵiψi(ri)The solution of a differential equation is a function. That is, we want to find some functions ψi(ri) (molecular orbicals) so that the above equation holds for all r1∈R3.
However, such equation can be hard to solve in function space. Therefore we want to expand the wave function (molecular orbical) as the linear conbination of some (let’s say, K) basis functions,
ψi(ri)=K∑u=1Cuiϕu(ri)Then the problem of solving ψ (in function space) transforms to solving K coefficients Ci=(C1i,⋯,CKi)T (in vector space RK), which can be easier to solve with linear algebra. Note that Ci is constrainted by ∫ψi(ri)2dri=1⇒∑u,vCuiCvi∫ϕu(ri)ϕv(ri)dri⏟Suv=1⇒CTiSCi=1
For the selection of basis function, a common practice is to use known atomic orbital of each atoms (molecular orbitals ψi are formed as a linear combination of atomic orbitals ϕui, or MO-LCAO, see section 2.2.5 of [1]). Atomic orbital is also a wave function which is usually known and have explicit form centered at each atoms. For example, the exact 1s orbital of a hydrogen atom is ϕ(r−R)=(ζ3/π)1/2e−ζ|r−R| where R is the center coodinate of the atom and ζ=1.0 (Slater orbital). However, for the sake of simpler integral evaluation (积分计算), we usually use Gaussian orbital which has the form ϕ(r−R)=(2α/π)3/4e−α|r−R|2.
Additionally, when the number of basis functions is K, later we will see that we can solve K molecular spatial orbitals ψ1,⋯,ψK (2K molecular spin orbitals χ1,⋯,χ2K) with orbital energies ϵ1<⋯<ϵK. For the ground state which has the lowest energy, the N electrons will occupy the lowest molecular spin orbitals, and leave the other 2K−N orbital unoccupied or virtual.
Now we see how to solve the equation (2) with the expansion ψi(ri)=∑Ku=1Cuiϕu(ri). Substitude the expansion into (2) we have
h(i)K∑v=1Cviϕv(ri)=ϵiK∑v=1Cviϕv(ri)Since we have K unknown variables C1i,⋯,CKi, if we can construct K normal equations then we can solve the K variables out. The important trick here is to multiple ϕu(ri) on the left for u=1,⋯,K and integrate (see page 29 of [1] for more details). Then we have K equations
∫K∑v=1Cviϕu(ri)h(i)ϕv(ri)dri=∫ϵiK∑v=1Cviϕu(ri)ϕv(ri)dri,∀u=1,⋯,KK∑v=1Cvi∫ϕu(ri)h(i)ϕv(ri)dri⏟Huv=ϵiK∑v=1Cvi∫ϕu(ri)ϕv(ri)dri⏟Suv,∀u=1,⋯,KSince both the basis functions (atomic orbitals) ϕ and h(i) are explicitly given, the integral ∫ϕu(ri)h(i)ϕv(ri)dri and ∫ϕu(ri)ϕv(ri)dri can be calculated as constants (later we will show the explicit expression of these integrals). We denote constant Huv=∫ϕu(ri)h(i)ϕv(ri)dri and Suv=∫ϕu(ri)ϕv(ri)dri (overlap integral), then we have
K∑v=1HuvCvi=ϵiK∑v=1SuvCvi,∀u=1,⋯,KWe can write the K equations together in matrix form as
(H11H12⋯H1KH21H22⋯H2K⋮⋮⋱⋮HK1HK2⋯HKK)(C1iC2i⋮CKi)=ϵi(S11S12⋯S1KS21S22⋯S2K⋮⋮⋱⋮SK1SK2⋯SKK)(C1iC2i⋮CKi)or
HCi=ϵiSCiwhere Ci=(C1i,⋯,CKi)T. If our basis functions are orthonormal (正交, that is, Suv=∫ϕu(r)ϕv(r)dr=δuv={1u=v0u≠v, see 1.2 of [1]), then S=I is an identity matrix. In this case we solve HCi=ϵiCi, which equals to finding K eigenvector C1,⋯,CK and eigenvalue ϵ1,⋯,ϵK of matrix H. Then we get K molecular spatial orbitals ψ1(r1)=∑Ku=1Cu1ϕu(r1),⋯,ψK(rK)=∑Ku=1CuKϕu(rK) in which both coefficients C and basis functions ϕ are known.
However, the basis functions are usually not orthonormal (S≠I), so what we actually need to solved is a generalized eigenvalue problem formed as Ax=λBx instead of Ax=λx. As stated in this tutorial, there is a quick & dirty way to solve the generalized problem: let C=B−1A and solve Cx=λx. However, there are two concerns about this method. The first is that C=B−1A may not be symmetric (so the final eigenvectors may not be orthogonal), so you need to use np.linalg.eig
in NumPy instead of np.linalg.eigh
. Usually symmetric matrix is more favorable in eigen decomposition. The second is that each eigenvector needs to be additionally scaled so as to satisfy xTBx=1 (recall that the orbital coefficient is constrainted by CTiSCi=1)
Here we use a more rigorous way to solve it. In this case we want to find a linear transformation X so that XTSX=I. If we can find such a transformation X, let C′i=X−1Ci (Ci=XC′i) so we have HXC′i=ϵiSXC′i. Multiply XT on the left and we have XTHX⏟=H′C′i=ϵiXTSX⏟=IC′i. Let H′=XTHX and we have H′C′i=ϵiC′i. Then we solve C′i as usual and do linear transformation Ci=XC′i (not orthogonal after the transformation) to get solution Ci of the original equation HCi=ϵiSCi.
The following are the steps to find the transformation X. We first do matrix diagonalization (矩阵对角化) to S, that is, find K eigenvectors U=[U1,⋯,UK] and eigenvalues s=[s1,⋯,sK] so that SU=Udiag(s)⇒UTSU=diag(s) (U is orthonormal and normalized so UT=U−1). Then let X=Udiag(s−1/2) in which s−1/2=[s−1/21,⋯,s−1/2K], we have XTSX=diag(s−1/2)UTSUdiag(s−1/2)=diag(s−1/2)diag(s)diag(s−1/2)=I (remind that (AB)T=BTAT). (see 3.4.5 of [1] for details)
The final step towards actual computation is to calculate Huv=∫ϕu(ri)h(i)ϕv(ri)dri and Suv=∫ϕu(ri)ϕv(ri)dri for Gaussian atomic orbitals ϕu(r−Ru)=(2α/π)3/4e−α|r−Ru|2 and h(i)=−12∇2i−∑AZArAi. While the integrals can be calculated analytically, the process is quite complicated. Here we list the results of the integrals directly from appendix A of [1], more details can be found in the appendix.
Suv=∫ϕu(ri)ϕv(ri)dri=(πp)3/2e−αβα+β|Ru−Rv|2(A.9)Huv=Tuv+∑AVAuvin whichTuv=∫ϕu(ri)−12∇2iϕv(ri)dri=αβα+β[3−2αβα+β|Ru−Rv|2]Suv(A.11)VAuv=∫ϕu(ri)−ZArAiϕv(ri)dri=−2πα+βZAe−αβα+β|Ru−Rv|2F0[(α+β)|Rp−Rv|2](A.33)F0(t)=12√πterf(√t)p=α+βRp=αRu+βRvα+βin which erf(t) is the Gauss error function.
Hartree approximation (Vee considered, antisymmetry principle ignored)
Preliminaries: variation principle
Before considering Vee, let’s derive HCi=ϵiSCi (4) directly from the Schrodinger equation (1) via variation principle with apprximated hartree product wave function Ψ(r1,⋯,rN)=∏iψi(ri).
The variation principle is as follows: for eigenvalue problem HΨ=EΨ in which H=∑Ni=1h(i) is a Hermitian operator and Ψ is a wave function, there exists an infinate set of exact solutions HΨα=EαΨα where E0≤E1≤⋯≤Eα≤⋯. Then, given an arbitrary normalized wave function ˜Ψ,∫|˜Ψ|2=1, the expectation of the Hamiltionian is an upper bound to the exact ground state enengy E0, that is, ∫˜ΨH˜Ψ≥E0. The equality holds only when ˜Ψ=Ψ0. (see section 1.3 of [1] for details)
Therefore, we can transform the solving of Ψ0 into a constrainted optimization problem: min. With \Psi(r_1, \cdots, r_N) = \prod_i \psi_i(r_i), \int |\Psi|^2 = 1 is transformed to N constraints \int |\psi_i|^2 = 1, i = 1, \cdots, N. The standard method to solve constrainted optimization problem is to construct a Lagrangian function \mathcal{L}(\Psi, E) = \int \Psi H \Psi - \sum_i E_i(\int |\psi_i|^2 - 1) and let \frac{\partial \mathcal{L}}{\partial \Psi} = 0.
we first calculate
\begin{align} & \int \Psi(r_1, \cdots, r_N) H \Psi(r_1, \cdots, r_N) dr_1\cdots dr_N \nonumber \\ =& \sum_{i=1}^N \int_{r_N}\cdots\int_{r_1} \psi(r_1)\cdots\psi(r_N) h(i) \psi(r_1)\cdots\psi(r_N) dr_1\cdots dr_N \nonumber \\ & \text{Note that $\int \psi_i(r)\psi_i(r) dr = 1$ and $h(i)$ is only related to $r_i$, so} \nonumber \\ =& \sum_{i=1}^N \int_{r_i} \psi_i(r_i)h(i)\psi_i(r_i) dr_i \label{E_h} \\ & \text{with the basis expansion $\psi_i(r_i) = \sum_{u=1}^K C_{ui} \phi_u(r_i)$} \nonumber \\ =& \sum_{i=1}^N \int \sum_{u, v}C_{ui}C_{vi}\phi_u(r_i)h(i)\phi_v(r_i) dr_i \nonumber \\ =& \sum_{i=1}^N \sum_{u, v}C_{ui}C_{vi} \underbrace{\int \phi_u(r_i)h(i)\phi_v(r_i) dr_i}_{H_{uv}} = \sum_{i=1}^N \sum_{u, v}C_{ui}C_{vi}H_{uv} \label{raw_hcore_energy} \\ \end{align}and
\begin{aligned} & \sum_{i=1}^N E_i(\int \psi_i(r_i)\psi_i(r_i) dr_i - 1) \\ =& \sum_{i=1}^N E_i(\sum_{u, v} C_{ui}C_{vi}\underbrace{\int \phi_u(r_i)\phi_v(r_i) dr_i}_{S_{uv}} - 1) \\ =& \sum_{i=1}^N E_i(\sum_{u, v} C_{ui}C_{vi}S_{uv} - 1) \end{aligned}Thus
\begin{align*} \mathcal{L}(C, E) &= \sum_{i=1}^N [\sum_{u, v}C_{ui}C_{vi}H_{uv} - E_i(\sum_{u, v} C_{ui}C_{vi}S_{uv} - 1)] \\ &= \sum_{i=1}^N [\sum_{u, v}C_{ui}C_{vi}(H_{uv} - E_i S_{uv}) + E_i]\\ \frac{\partial L}{\partial C_{ui}} &= \sum_{v \neq u} 2 C_{ui}(H_{uv} - E_i S_{uv}) + 2 C_{ui}(H_{uv} - E_i S_{uv}) = 0 \\ & 2\sum_v C_{vi}(H_{uv} - E_i S_{uv}) = 0 \\ & \sum_v H_{uv}C_{vi} = E_i \sum_v S_{uv}C_{vi}, \forall i = 1, \cdots, N, u = 1, \cdots, K \end{align*}which is identical to \eqref{simplest_raw} whose matrix form is HC_i = \epsilon_i SC_i \eqref{eq3}. (more details can be found in section 1.3.2 of [1] who assume the basis functions are orthogonal so S = I)
Hartree approximation with variation principle
Now let’s take V_{ee} = \sum_{i < j}\frac{1}{r_{ij}} into account, so we are going to solve
[\sum_i h(i) + \sum_{i < j}\frac{1}{r_{ij}}]\Psi(r_1, \cdots, r_N) = E\Psi(r_1, \cdots, r_N)in this case H = \sum_i h(i) + \sum_{i < j}\frac{1}{r_{ij}} = \sum_i h(i) + \frac{1}{2}\sum_{i \neq j}\frac{1}{r_{ij}}, so we have a new term \int \Psi (\frac{1}{2}\sum_{i \neq j}\frac{1}{r_{ij}}) \Psi. That is,
\begin{align} & \int \Psi(r_1, \cdots, r_N) (\frac{1}{2}\sum_{i \neq j} \frac{1}{r_{ij}}) \Psi(r_1, \cdots, r_N) dr_1\cdots dr_N \nonumber \\ =& \frac{1}{2}\sum_{i \neq j} \int_{r_N}\cdots\int_{r_1} \psi(r_1)\cdots\psi(r_N) \frac{1}{r_{ij}} \psi(r_1)\cdots\psi(r_N) dr_1\cdots dr_N \nonumber \\ =& \frac{1}{2}\sum_{i \neq j} \int_{r_i, r_j} \psi_i(r_i)\psi_j(r_j)\frac{1}{r_{ij}}\psi_j(r_j)\psi_i(r_i) dr_i dr_j \label{E_coulomb} \\ =& \frac{1}{2}\sum_{i \neq j} \sum_{u, v, \lambda, \sigma}C_{ui}C_{\lambda j}C_{\sigma j}C_{vi} \underbrace{\int \phi_u(r_i)\phi_\lambda(r_j)\frac{1}{r_{ij}}\phi_\sigma(r_j)\phi_v(r_i) dr_i dr_j}_{(uv|\lambda\sigma), \text{see (3.156) of [1]}} \label{E_coulomb_C} \\ \frac{\partial}{\partial C_{ui}} =& 2 \times \frac{1}{2} \sum_{j (\neq i)}\sum_{v, \lambda, \sigma}2 C_{vi}C_{\lambda j}C_{\sigma j}(uv|\lambda\sigma) \nonumber \\ =& \sum_v 2C_{vi}[\sum_{j (\neq i)}\sum_{\lambda, \sigma}C_{\lambda j}C_{\sigma j}(uv|\lambda\sigma)] \nonumber \end{align}(Here basis index u, v correspond to electron i, \lambda, \sigma correspond to electron j. Please note that we have a factor of 2 before the derivative, since \sum_{i, j}(i \neq j) has two indices which are both possible to equal to i. Imaging a square matrix with ith row and ith column)
Therefore
\begin{align} & \frac{\partial L}{\partial C_{ui}} = 2\sum_v C_{vi}(H_{uv} + \sum_{j (\neq i)}\sum_{\lambda, \sigma}C_{\lambda j}C_{\sigma j}(uv|\lambda\sigma) - E_i S_{uv}) = 0 \nonumber\\ & \sum_v (H_{uv} + \sum_{j (\neq i)}\sum_{\lambda, \sigma}C_{\lambda j}C_{\sigma j}(uv|\lambda\sigma))C_{vi} = E_i \sum_v S_{uv}C_{vi}, \forall i = 1, \cdots, N, u = 1, \cdots, K \label{hartree_raw} \end{align}\eqref{hartree_raw} is the raw form of Hartree approximation. We will discuss how to transform it to matrix form in later section.
Now let’s see if we replace h(i) with h(i) + g(i) in which
g(i) = \sum_{j (\neq i)}\int \frac{|\psi_j(r_j)|^2}{r_{ij}} dr_j(notice that |\psi_j(r_j)|^2 is the possibility density of electron j that appears at coordinate r_j, \int \frac{|\psi_j(r_j)|^2}{r_{ij}} dr_j is the repulsion of electron j towards electron i, so g(i) is the average field of all other electrons), then the matrix form of [h(i) + g(i)]\psi_i(r_i) = \epsilon_i\psi_i(r_i) with basis expansion \psi_i(r_i) = \sum_{u=1}^K C_{ui} \phi_u(r_i) will result in the same result above. Similar to the simplest case discussed before, we write g(i)\psi_i(r_i) as g(i)\sum_v C_{vi} \phi_v(r_i) = \sum_v C_{vi} g(i)\phi_v(r_i), left multiplied by \phi_u(r_i) and integrate. We have
\begin{align*} &\sum_v C_{vi}\int \phi_u(r_i)g(i)\phi_v(r_i) dr_i \\ =& \sum_v C_{vi}\int \phi_u(r_i)[\sum_{j (\neq i)}\int \psi_j(r_j)\frac{1}{r_{ij}}\psi_j(r_j) dr_j]\phi_v(r_i) dr_i \\ =& \sum_v C_{vi}\int \phi_u(r_i)[\sum_{j (\neq i)}\sum_{\lambda, \sigma}C_{\lambda j}C_{\sigma j}\int \phi_\lambda(r_j)\frac{1}{r_{ij}}\phi_\sigma(r_j) dr_j]\phi_v(r_i) dr_i \\ =& \sum_v C_{vi}\sum_{j (\neq i)}\sum_{\lambda, \sigma}C_{\lambda j}C_{\sigma j} \int \phi_u(r_i) \phi_\lambda(r_j)\frac{1}{r_{ij}}\phi_\sigma(r_j) \phi_v(r_i) dr_i dr_j \\ =& \sum_v C_{vi}\sum_{j (\neq i)}\sum_{\lambda, \sigma}C_{\lambda j}C_{\sigma j} (uv|\lambda\sigma) \end{align*}Hartree-Fock approximation
Now let’s replace the simple hartree product wave function \Psi(r_1, \cdots, r_N) = \prod_i \psi_i(r_i) with the better (single) slater determinant \Psi(x_1, \cdots, x_N) = \frac{1}{\sqrt{N!}}\begin{vmatrix}\chi_1(x_1) & \chi_2(x_1) & \cdots & \chi_N(x_1)\\ \chi_1(x_2) & \chi_2(x_2) & \cdots & \chi_N(x_2)\\ \vdots & \vdots & \ddots & \vdots\\ \chi_1(x_N) & \chi_2(x_N) & \cdots & \chi_N(x_N)\end{vmatrix} (x_i = (\omega_i, r_i) where \omega_i is the spin direction). For example, when N = 2, the hartree product wave function is \Psi(r_1, r_2) = \psi_1(r_1)\psi_2(r_2) while the slater determinant wave function will have the form of \Psi(x_1, x_2) = \frac{1}{2}[\chi_1(x_1)\chi_2(x_2) - \chi_1(x_2)\chi_2(x_1)].
Again, we consider the constrainted optimization problem: \min \int \Psi (\sum_i h(i) + \frac{1}{2}\sum_{i \neq j}\frac{1}{r_{ij}}) \Psi \text{ s.t. } \int |\Psi|^2 = 1. The only difference is that the wave function is replaced with a (single) slater determinant. Assuming that we already calculated the objective (expectation value of energy)
\begin{align} E_0 &= \int \Psi (\sum_i h(i) + \frac{1}{2}\sum_{i \neq j}\frac{1}{x_{ij}}) \Psi \nonumber\\ &= \underbrace{\sum_{i=1}^N \int \chi_i(x) h(i) \chi_i(x) dx}_{\eqref{E_h}\text{, the simplest case, or $\sum_i[i|h|i]$}} + \underbrace{\frac{1}{2}\sum_{i, j} \int \chi_i(x_1)\chi_i(x_1)\frac{1}{r_{12}}\chi_j(x_2)\chi_j(x_2) dx_1 dx_2}_{\eqref{E_coulomb}\text{, added in Hartree approx., or $\frac{1}{2}\sum_{i, j} [ii|jj]$}} \nonumber\\ &- \underbrace{\frac{1}{2}\sum_{i, j} \int \chi_i(x_1)\chi_j(x_1)\frac{1}{r_{12}}\chi_j(x_2)\chi_i(x_2) dx_1 dx_2}_{\text{New "exchange" term, or $\frac{1}{2}\sum_{i, j} [ij|ji]$}} \label{raw_total_energy} \end{align}This result can be found in (2.111) and (3.39) of [1]. Note that we replace spatial orbitals \psi_i(r) with spin orbitals \chi_i(x), and have several minor changes of notation for simplicity:
- r_i, r_j is replaced by r_1, r_2
- We write spin orbitals with r_1 before r_{12}^{-1}, and spin orbitals with r_2 after r_{12}^{-1}.
- \sum_{i \neq j} is replaced by \sum_{i, j}, since the i = j terms in the second term (coulomb) and third term (exchange) cancel each other out.
Now we are going to transit the spin orbitals to spatial orbitals. Note that \chi_i(x) = \alpha(\omega)\psi_i(r) \text{ or } \beta(\omega)\psi_i(r), x = (\omega, r) and the two spin wave functions \alpha(\omega) and \beta(\omega) are orthogonal (\int \alpha(\omega)\alpha(\omega)d\omega = \int \beta(\omega)\beta(\omega)d\omega = 1, \int \alpha(\omega)\beta(\omega)d\omega = 0)
By spliting x to \omega and r, since h(i) is only related with r and has no relationship with \omega, we can check that
\begin{align*} \int \chi_i(x) h(i) \chi_i(x) dx &= \int_r \int_\omega \alpha(\omega)\psi_i(r)h(i)\alpha(\omega)\psi_i(r) d\omega dr \\ &= \int_r [\underbrace{\int_\omega \alpha(\omega)\alpha(\omega)d\omega}_{=1}]\psi_i(r)h(i)\psi_i(r) dr \\ &= \int \psi_i(r)h(i)\psi_i(r) dr \end{align*}Replacing \alpha(\omega) with \beta(\omega) will not change the result. We can also use a simpler notation [i|h|i] = (i|h|i) ([] is for spin orbitals and () is for spatial orbitals)
In a similar way, for the second term (coulomb) we have \int \chi_i(x_1)\chi_i(x_1)\frac{1}{r_{12}}\chi_j(x_2)\chi_j(x_2) dx_1 dx_2 = \int \psi_i(r_1)\psi_i(r_1)\frac{1}{r_{12}}\psi_j(r_2)\psi_j(r_2) dr_1 dr_2 (or [ii|jj] = (ii|jj)). However, for the third term (exchange), things become complicated. We cannot compute it unless we know the exact spin direction (\alpha(\omega) or \beta(\omega)) for each spin orbital \chi_i.
Assuming we are in a closed-shell restricted scenario. That is, we have an even number of electrons, filling the spatial orbitals from bottom to up, each spatial orbitals contain two electrons with opposite spin, such as \chi_1(x_1) = \alpha(\omega_1)\psi_1(r_1), \chi_2(x_2) = \beta(\omega_1)\psi_1(r_1), \chi_3(x_3) = \alpha(\omega_3)\psi_2(r_3), \chi_4(x_4) = \beta(\omega_4)\psi_2(r_4), etc. In this case the first and second term become 2 \sum_{i=1}^{N/2} \int \psi_i(r)h(i)\psi_i(r) dr and 2 \sum_{i=1}^{N/2} \sum_{j=1}^{N/2} \int \psi_i(r_1)\psi_i(r_1)\frac{1}{r_{12}}\psi_j(r_2)\psi_j(r_2) dr_1 dr_2 (2 = 2 \times 2 \times \frac{1}{2}) respectively. For the third term (exchange), both electrons i and j can be in spin \alpha or \beta, then we discuss it in four cases (ij\rightarrow\alpha\alpha, ij\rightarrow\beta\alpha, ij\rightarrow\alpha\beta, ij\rightarrow\beta\beta):
\begin{align*} & \frac{1}{2}\sum_{i=1}^N \sum_{j=1}^N \int \chi_i(x_1)\chi_j(x_1)\frac{1}{r_{12}}\chi_j(x_2)\chi_i(x_2) dx_1 dx_2 \\ =& \frac{1}{2}\sum_{i=1}^{N/2} \sum_{j=1}^{N/2} \int {\color{red}{\alpha(\omega_1)}}\psi_i(r_1){\color{red}{\alpha(\omega_1)}}\psi_j(r_1)\frac{1}{r_{12}}{\color{red}{\alpha(\omega_2)}}\psi_j(r_2){\color{red}{\alpha(\omega_2)}}\psi_i(r_2) d\omega_1 dr_1 d\omega_2 dr_2 \\ =& \frac{1}{2}\sum_{i=1}^{N/2} \sum_{j=1}^{N/2} \int {\color{blue}{\beta(\omega_1)}}\psi_i(r_1){\color{red}{\alpha(\omega_1)}}\psi_j(r_1)\frac{1}{r_{12}}{\color{red}{\alpha(\omega_2)}}\psi_j(r_2){\color{blue}{\beta(\omega_2)}}\psi_i(r_2) d\omega_1 dr_1 d\omega_2 dr_2 \\ =& \frac{1}{2}\sum_{i=1}^{N/2} \sum_{j=1}^{N/2} \int {\color{red}{\alpha(\omega_1)}}\psi_i(r_1){\color{blue}{\beta(\omega_1)}}\psi_j(r_1)\frac{1}{r_{12}}{\color{blue}{\beta(\omega_2)}}\psi_j(r_2){\color{red}{\alpha(\omega_2)}}\psi_i(r_2) d\omega_1 dr_1 d\omega_2 dr_2 \\ =& \frac{1}{2}\sum_{i=1}^{N/2} \sum_{j=1}^{N/2} \int {\color{blue}{\beta(\omega_1)}}\psi_i(r_1){\color{blue}{\beta(\omega_1)}}\psi_j(r_1)\frac{1}{r_{12}}{\color{blue}{\beta(\omega_2)}}\psi_j(r_2){\color{blue}{\beta(\omega_2)}}\psi_i(r_2) d\omega_1 dr_1 d\omega_2 dr_2 \end{align*}The second and third term will vanish since \int {\color{red}{\alpha(\omega)}}{\color{blue}{\beta(\omega)}}d\omega = 0. For the other terms, the spin wave function vanished, and the final result is \sum_{i=1}^{N/2} \sum_{j=1}^{N/2} \int \psi_i(r_1)\psi_j(r_1)\frac{1}{r_{12}}\psi_j(r_2)\psi_i(r_2) dr_1 dr_2. This reflect the fact that there is only an exchange interaction between electrons of parallel spin. See details in Chapter 2.3.5 (page 81) and 3.4.1 (page 133) of [1].
Now we expand the exchange term with basis functions and take derivative
\begin{align} &\sum_{i=1}^{N/2} \sum_{j=1}^{N/2} \int \psi_i(r_1)\psi_j(r_1)\frac{1}{r_{12}}\psi_j(r_2)\psi_i(r_2) dr_1 dr_2 \nonumber\\ =& \sum_{i=1}^{N/2} \sum_{j=1}^{N/2} \sum_{u, v, \lambda, \sigma}C_{ui}C_{vi}C_{\lambda j}C_{\sigma j}\underbrace{\int \phi_u(r_1)\phi_\lambda(r_1)\frac{1}{r_{12}}\phi_\sigma(r_2)\phi_v(r_2) dr_1 dr_2}_{(u\lambda|\sigma v)} \label{raw_exchange_energy}\\ \frac{\partial}{\partial C_{ui}} =& 4\sum_{j=1}^{N/2} \sum_{v, \lambda, \sigma} C_{vi}C_{\lambda j}C_{\sigma j} (u\lambda|\sigma v)\nonumber \end{align}Therefore
\begin{align} & \frac{\partial L}{\partial C_{ui}} = \sum_v C_{vi}(4 H_{uv} + 8 \sum_{j=1}^{N/2}\sum_{\lambda, \sigma}C_{\lambda j}C_{\sigma j}(uv|\lambda\sigma) - 4 \sum_{j=1}^{N/2} \sum_{\lambda, \sigma} C_{\lambda j}C_{\sigma j} (u\lambda|\sigma v) - E_i S_{uv}) = 0 \nonumber\\ & \sum_v [H_{uv} + \sum_{j=1}^{N/2} (2 \sum_{\lambda, \sigma}C_{\lambda j}C_{\sigma j}(uv|\lambda\sigma) - \sum_{\lambda, \sigma} C_{\lambda j}C_{\sigma j}(u\lambda|\sigma v))]C_{vi} = E_i \sum_v S_{uv}C_{vi} \label{hf_raw}\\ & \forall i = 1, \cdots, N, u = 1, \cdots, K \nonumber \end{align}\eqref{hf_raw} is the raw form of Hartree-Fock approximation. We will discuss how to transform it to matrix form in later section.
Exchange term as an operator
Now let’s see if it is possible to use an operator k(i) (similar to g(i) in Hartree approximation) to represent the additional exchange term in the above equation, so that the matrix form of [h(i) + g(i) + k(i)]\psi_i(r_i) = \epsilon_i\psi_i(r_i) with basis expansion \psi_i(r_i) = \sum_{u=1}^K C_{ui} \phi_u(r_i) will result in the same result above. Actually it is possible to define k(i) in the following way
k(i){\color{red}\psi_{\color{red}i}}(r_i) = \sum_{j=1}^{N/2}[\int {\color{blue}\psi_{\color{blue}j}}(r_j) r_{ij}^{-1} {\color{red}\psi_{\color{red}i}}(r_j)] {\color{blue}\psi_{\color{blue}j}}(r_i)compared with former g(i)
g(i){\color{red}\psi_{\color{red}i}}(r_i) = 2\sum_{j=1}^{N/2}[\int {\color{blue}\psi_{\color{blue}j}}(r_j) r_{ij}^{-1} {\color{blue}\psi_{\color{blue}j}}(r_j)] {\color{red}\psi_{\color{red}i}}(r_i)Similarly, we write k(i)\psi_i(r_i) as k(i)\sum_v C_{vi} \phi_v(r_i) = \sum_v C_{vi} k(i)\phi_v(r_i), left multiplied by \phi_u(r_i) and integrate. We have
\begin{align*} &\sum_v C_{vi}\int \phi_u(r_i)k(i)\phi_v(r_i) dr_i \\ =& \sum_v C_{vi}\int \phi_u(r_i)[\sum_{j=1}^{N/2}\int \psi_j(r_j)\frac{1}{r_{ij}}\phi_v(r_j) dr_j]\psi_j(r_i) dr_i \\ =& \sum_v C_{vi}\sum_{\lambda}C_{\lambda j}\int \phi_u(r_i)[\sum_{j=1}^{N/2}\sum_{\sigma}C_{\sigma j}\int \phi_\sigma(r_j)\frac{1}{r_{ij}}\phi_v(r_j) dr_j]\phi_\lambda(r_i) dr_i \\ =& \sum_v C_{vi}\sum_{j=1}^{N/2}\sum_{\lambda, \sigma}C_{\lambda j}C_{\sigma j}\int \phi_u(r_i)\phi_\lambda(r_i)\frac{1}{r_{ij}} \phi_\sigma(r_j) \phi_v(r_j) dr_i dr_j \\ =& \sum_v C_{vi}\sum_{j=1}^{N/2}\sum_{\lambda, \sigma}C_{\lambda j}C_{\sigma j} (u\lambda|\sigma v) \end{align*}Matrix form and charge density matrix
Now we want to write the Hartree equation \eqref{hartree_raw} and Hartree-Fock equation \eqref{hf_raw} in matrix form. We restate them below:
\begin{align*} \sum_v (H_{uv} + \sum_{j (\neq i)}\sum_{\lambda, \sigma}C_{\lambda j}C_{\sigma j}(uv|\lambda\sigma))C_{vi} &= E_i \sum_v S_{uv}C_{vi} \\ \sum_v [H_{uv} + \sum_{j=1}^{N/2} (2 \sum_{\lambda, \sigma}C_{\lambda j}C_{\sigma j}(uv|\lambda\sigma) - \sum_{\lambda, \sigma} C_{\lambda j}C_{\sigma j}(u\lambda|\sigma v))]C_{vi} &= E_i \sum_v S_{uv}C_{vi}\\ \forall i = 1, \cdots, N, u = 1, \cdots, K \end{align*}We focus on Hartree-Fock equation below. Recall that in the simplest case we got HC_i = \epsilon_i SC_i \eqref{eq3} from \sum_{v=1}^K H_{uv} C_{vi} = \epsilon_i \sum_{v=1}^K S_{uv} C_{vi}, \forall u = 1, \cdots, K \eqref{simplest_raw}, u and v are row and column indices respectively. Also recall that in coulomb term \eqref{E_coulomb_C}, basis index u, v correspond to the first electron i, \lambda, \sigma correspond to the second electron j. Let
\begin{align*} G_{uv} &= \sum_{j=1}^{N/2} (2 \sum_{\lambda, \sigma}C_{\lambda j}C_{\sigma j}(uv|\lambda\sigma) - \sum_{\lambda, \sigma} C_{\lambda j}C_{\sigma j}(u\lambda|\sigma v)) \\ &= \sum_{\lambda, \sigma}\underbrace{(2\sum_{j=1}^{N/2} C_{\lambda j}C_{\sigma j})}_{P_{\lambda\sigma}}(uv|\lambda\sigma) - \frac{1}{2}\sum_{\lambda, \sigma}\underbrace{(2\sum_{j=1}^{N/2} C_{\lambda j}C_{\sigma j})}_{P_{\lambda\sigma}}(u\lambda|\sigma v) \end{align*}Here we move the sum operator over all other electrons \sum_{j=1}^{N/2} to the inner of the term, and define
\begin{equation} P_{\lambda\sigma} = 2\sum_{j=1}^{N/2} C_{\lambda j}C_{\sigma j} \label{P_def} \end{equation}(Or in matrix form P = 2CC^T in which C is a K \times N/2 matrix. Here P is called density matrix. We will discuss the physical meaning of P in later section), thus
\begin{equation} G_{uv} = \sum_{\lambda, \sigma}P_{\lambda\sigma}(uv|\lambda\sigma) - \frac{1}{2}\sum_{\lambda, \sigma}P_{\lambda\sigma}(u\lambda|\sigma v) \label{G_def} \end{equation}or, in a NumPy way,
np.einsum("uvkl, kl -> uv", eri, D) - 0.5 * np.einsum("ukvl, kl -> uv", eri, D)
Now we can write Hartree-Fock equation in matrix form, similar to \eqref{eq3}, that is,
(H + G)C_i = \epsilon_i SC_iOr more concisely, define Fock matrix F = H + G (see (3.154) of [1]), we have
\begin{equation} FC_i = \epsilon_i SC_i \label{HF} \end{equation}which is also a (generalized) eigen decomposition in which we solve K eigenvalue \epsilon_1, \cdots, \epsilon_K and K elgenvectors C_1, \cdots, C_K for the Fock matrix F. This equation is usually called Roothaan equation.
Self-Consistent Field (SCF)
However, note that we cannot directly solve \eqref{HF} as stated in the simplest case \eqref{eq3}. Different from H in \eqref{eq3} which is a known constant, F is dependent on G, G is dependent on P \eqref{G_def}, P is dependent on C_1, \cdots, C_{N/2} \eqref{P_def}. In a word, F is a function of C (F = F(C)), but C is just the result that we want to solve. So this becomes a chicken-egg problem. To solve \eqref{HF} we need C to construct Fock matrix F, to obtain C we need to solve \eqref{HF}, …, which is a dead lock.
In reality we can try Self-Consistent Field (SCF) method to solve the equation. Simply speaking, we randomly choose a C^0 so that we can construct F and solve \eqref{HF} to obtain a new C^1. Iterate the process until the new generated C^i is nearly the same as the previous C^{i-1}. In implementation we actually use P to replace C in the aforementioned iteration process. That is:
- Choose an initial P
- iterate until |P - P’| < \text{threshold}:
- Compute Fock matrix F = H + G(P) following \eqref{G_def}
- Solve FC_i = \epsilon_i SC_i \eqref{HF} via the procedure introduced in previous section, obtain K elgenvectors C_1, \cdots, C_K whose eigenvalue is in ascending order.
- Store previous density matrix P’ = P and compute new P = 2CC^T, in which C = [C_1, \cdots, C_{N/2}] is a K \times N/2 matrix consisting of the first N/2 eigenvectors.
Unfortunately, SCF is not guaranteed to converge. The selection of initial value and the tricks in iteration process (such as DIIS) is critical to its convergence.
Charge density matrix
Note that P_{\lambda\sigma} in \eqref{P_def} actually has physical meaning. Remind that \psi_i(r) (orbital) is the wave function of a single electron. Therefore \rho_i(r) = \psi_i^2(r) is the probability density that this electron shows at coodinate r, \int_{r} \psi^2_i(r) dr = 1. Then
\begin{align*} \rho_i(r) = \psi_j^2(r) &= (\sum_{\lambda=1}^K C_{\lambda j} \phi_\lambda(r))(\sum_{\sigma=1}^K C_{\sigma j} \phi_\sigma(r)) \\ &= \sum_{\lambda\sigma}C_{\lambda j}C_{\sigma j} \phi_\lambda(r)\phi_\sigma(r) \end{align*}Since K basis function \phi_1(r), \cdots, \phi_K(r) are given, the above expression shows that, if we know K variables C_{1j}, \cdots, C_{Kj} then we can get the probability density function of a single orbital \psi_j^2(r).
So how many variables do we need if we want the total density function of all N electrons. That is \rho(r) = 2\sum_{j=1}^{N/2}\rho_j(r) (\int_{r} \rho(r) dr = N)? A simple answer is (N/2) \times K matrix of C. However,
\begin{align*} \rho(r) = 2\sum_{j=1}^{N/2}\psi_j^2(r) &= 2\sum_{j=1}^{N/2}\sum_{\lambda\sigma}C_{\lambda j}C_{\sigma j} \phi_\lambda(r)\phi_\sigma(r) \\ &= \sum_{\lambda\sigma}(2\sum_{j=1}^{N/2}C_{\lambda j}C_{\sigma j}) \phi_\lambda(r)\phi_\sigma(r) \\ &= \sum_{\lambda\sigma}P_{\lambda\sigma} \phi_\lambda(r)\phi_\sigma(r) \end{align*}So besides C, a K \times K matrix P is also sufficient to describe the total density function \rho(r). For this reason we call P the (charge) density matrix.
Total Energy
Now we review the total energy \eqref{raw_total_energy} and try to write it in a more elegant way using density matrix P (P_{\lambda\sigma} = 2\sum_{j=1}^{N/2} C_{\lambda j}C_{\sigma j} \eqref{P_def})and two-centered integral of basis functions (uv|\lambda\sigma) = \int \phi_u(r_i)\phi_\lambda(r_j)\frac{1}{r_{ij}}\phi_\sigma(r_j)\phi_v(r_i) dr_i dr_j.
E = E_{Hcore} + E_{coulomb} + E_{exchange} in which
\begin{align*} E_{Hcore} &= \sum_{i=1}^N \sum_{u, v}C_{ui}C_{vi}H_{uv} \quad \eqref{raw_hcore_energy}\\ &= 2\sum_{i=1}^{N/2} \sum_{u, v}C_{ui}C_{vi}H_{uv} \quad \text{(for closed-shell restricted scenario)} \\ &= \sum_{u, v}(2\sum_{i=1}^{N/2}C_{ui}C_{vi})H_{uv} = \sum_{u, v}P_{uv}H_{uv}\\ E_{coulomb} &= \frac{1}{2}\sum_{i=1}^{N} \sum_{j=1}^{N} \sum_{u, v, \lambda, \sigma}C_{ui}C_{\lambda j}C_{\sigma j}C_{vi}(uv|\lambda\sigma) \quad \eqref{E_coulomb_C} \\ &= 2\sum_{i=1}^{N/2} \sum_{j=1}^{N/2} \sum_{u, v, \lambda, \sigma}C_{ui}C_{\lambda j}C_{\sigma j}C_{vi}(uv|\lambda\sigma) \quad \text{(for closed-shell restricted scenario)} \\ &= \frac{1}{2}\sum_{u, v, \lambda, \sigma}(2\sum_{i=1}^{N/2}C_{ui}C_{vi})(2\sum_{j=1}^{N/2}C_{\lambda j}C_{\sigma j})(uv|\lambda\sigma) = \frac{1}{2}\sum_{u, v, \lambda, \sigma}P_{uv}P_{\lambda\sigma}(uv|\lambda\sigma)\\ E_{exchange} &= -\sum_{i=1}^{N/2} \sum_{j=1}^{N/2} \sum_{u, v, \lambda, \sigma}C_{ui}C_{vi}C_{\lambda j}C_{\sigma j}(u\lambda|\sigma v) \quad \eqref{raw_exchange_energy} \\ &= -\frac{1}{4} \sum_{u, v, \lambda, \sigma} (2\sum_{i=1}^{N/2}C_{ui}C_{vi})(2\sum_{j=1}^{N/2}C_{\lambda j}C_{\sigma j})(u\lambda|\sigma v) = -\frac{1}{4}\sum_{u, v, \lambda, \sigma}P_{uv}P_{\lambda\sigma}(u\lambda|\sigma v) \end{align*}Therefore
E = \sum_{u, v}P_{uv}H_{uv} + \frac{1}{2}\sum_{u, v, \lambda, \sigma}P_{uv}P_{\lambda\sigma}(uv|\lambda\sigma) - \frac{1}{4}\sum_{u, v, \lambda, \sigma}P_{uv}P_{\lambda\sigma}(u\lambda|\sigma v)Or, in a NumPy way,
E_elec = (HC * D).sum() + 0.5 * np.einsum("uvkl, uv, kl ->", eri, D, D) \\
- 0.25 * np.einsum("ukvl, uv, kl ->", eri, D, D)
in Py-xdh documentation. The usage of np.einsum
can be found here (In explicit mode with ->
).
Note that with Born-Oppenheimer approximation, we assume that the nuciels are fixed, static points with given coordinate R, thus we ignored the constant energy of nucleis (kinetic energy of nucleis T_N(R) and Coulombic repulsion between nucleis V_{NN}(R))in all the above discussions (see problem statement). However, when talking about the term “total energy”, these constant energies should be included, that is,
E_{nuc} = T_N(R) + V_{NN}(R) = 0 + \sum_{A < B}\frac{Z_A Z_B}{R_{AB}} = \frac{1}{2}\sum_{A, B}\frac{Z_A Z_B}{R_{AB}}which is quite easy to calculate using atomic number Z_A and nuclei coordinate R_A, without any basis or integral. Finally,
\begin{align} E_{total} &= E + E_{nuc} \nonumber \\ &= \sum_{u, v}P_{uv}H_{uv} + \frac{1}{2}\sum_{u, v, \lambda, \sigma}P_{uv}P_{\lambda\sigma}(uv|\lambda\sigma) - \frac{1}{4}\sum_{u, v, \lambda, \sigma}P_{uv}P_{\lambda\sigma}(u\lambda|\sigma v) + \frac{1}{2}\sum_{A, B}\frac{Z_A Z_B}{R_{AB}} \label{E_total} \end{align}Thomas-Fermi theory
From the last section, we notice that the electron density \rho(r) (or density matrix P) plays an important role. Expecially the G matrix \eqref{G_def} is a function of P. Then can we skip the wave function (\psi(r), C) and directly calculate through the electron density (\rho(r), P)? Thomas-Fermi theory shows such a possibility.
Preliminaries: electron density for N-electron wavefunction
For a single-electron wave function \psi_i(r), we know that \rho_i(r) = \psi_i^2(r) is the probability density that this electron shows at coodinate r, \int_{r} \psi^2_i(r) dr = 1. Then, what will be the electron density for N-electron wavefunction \Psi(r_1, \cdots, r_N)? Actually, it is similar to joint probability distribution f_{X, Y}(x, y) and marginal distribution f_{X}(x) = \int_y f_{X, Y}(x, y)dy for random variable X. That is, \rho(r_1, \cdots, r_N) = \Psi^2(r_1, \cdots, r_N) is the probability density that electron 1 shows at coordinate r_1, electron 2 shows at coordinate r_2, …, electron N shows at coordinate r_N. \int_{r_1, \cdots, r_N} \Psi^2(r_1, \cdots, r_N)dr_1, \cdots, dr_N = 1. The (marginal) probability density that electron i appears at coordinate r is that
\rho_i({\color{red}r}) = \int_{r_1, \cdots, r_{i-1}, r_{i+1}, \cdots, r_N} \Psi^2(r_1, \cdots, r_{i-1}, {\color{red}r}, r_{i+1}, \cdots, r_N) dr_1, \cdots, dr_{i-1}, dr_{i+1}, \cdots, dr_Nand we define (total) electron density \rho(r) simply as
\rho(r) = \sum_{i=1}^N \rho_i(r)which is NOT a probability density function but a sum of N probability density functions. \int \rho(r) dr = \sum_{i=1}^N \int \rho_i(r)dr = N.
Expressing total energy with electron density
The key of Thomas-Fermi theory is to express the total energy \int \Psi (T_e + V_{eN} + V_{ee})\Psi as a function of electron density \rho(r).
- T_e (kinetic energy): \int \Psi T_e \Psi = \int \Psi (-\frac{1}{2}\sum_i\nabla_i^2) \Psi hard to be expressed with \rho(r), but can be estimated by T_{TF}(\rho) = C_F\int\rho^{5/3}(r)dr in which C_F = \frac{3}{10}(3\pi^2)^{2/3} \simeq 2.871
-
V_{eN} (electrostatic potential between electrons and nuciels, or external potential. Can also be denoted as V_{ext}):
\begin{align*} \int \Psi V_{eN} \Psi &= \int \Psi(\sum_A\sum_i-\frac{Z_A}{r_{Ai}}) \Psi \\ &= \sum_{i=1}^N\int_{r_1, \cdots, r_N} \Psi(r_1, \cdots, r_N) (\sum_A-\frac{Z_A}{r_{Ai}}) \Psi(r_1, \cdots, r_N) dr_1\cdots dr_N \\ &= \sum_{i=1}^N\int_{r_i} \underbrace{(\sum_A-\frac{Z_A}{r_{Ai}})}_{v(i)} \underbrace{(\int_{r_1, \cdots, r_{i-1}, r_{i+1}, \cdots, r_N} \Psi^2(r_1, \cdots, r_N) dr_1\cdots dr_{i-1}, dr_{i+1}, \cdots, dr_N)}_{\rho_i(r_i)} dr_i \\ &= \sum_{i=1}^N\int_r v(i)\rho_i(r)dr = \int_r v(i)\underbrace{(\sum_{i=1}^N\rho_i(r))}_{\rho(r)}dr \\ &= \int v(i)\rho(r)dr \end{align*}in which v(i) = \sum_A-\frac{Z_A}{r_{Ai}}. remind that h(i) = -\frac{1}{2}\nabla_i^2 - \sum_{A}\frac{Z_A}{r_{Ai}} = -\frac{1}{2}\nabla_i^2 + v(i). In this way, V_{eN} can be precisely expressed via electron density \rho(r), no matter how the N-electron wave function \Psi is approximated.
-
V_{ee} (electrostatic repulsion between electrons): while \int \Psi (\frac{1}{2}\sum_{i \neq j}\frac{1}{r_{ij}}) \Psi is hard to tackle as before, we use Hartree approximation in which \Psi is approximated as the product of N single-electron wavefunction (orbitals) \Psi(r_1, \cdots, r_N) = \psi(r_1)\cdots\psi(r_N). We also relax \sum_{i \neq j} as \sum_{i, j} (\text{i.e., }\sum_i\sum_j)
\begin{align*} & \int \Psi(r_1, \cdots, r_N) (\frac{1}{2}\sum_{i, j} \frac{1}{r_{ij}}) \Psi(r_1, \cdots, r_N) dr_1\cdots dr_N \nonumber \\ =& \frac{1}{2}\sum_{i, j} \int_{r_N}\cdots\int_{r_1} \psi(r_1)\cdots\psi(r_N) \frac{1}{r_{ij}} \psi(r_1)\cdots\psi(r_N) dr_1\cdots dr_N \nonumber \\ =& \frac{1}{2}\sum_{i, j} \int_{r_i, r_j} \psi_i(r_i)\psi_j(r_j)\frac{1}{r_{ij}}\psi_j(r_j)\psi_i(r_i) dr_i dr_j \\ =& \frac{1}{2} \int_{r_i, r_j} \underbrace{(\sum_{i=1}^N\psi^2_i(r_i))}_{\rho(r_i)} \underbrace{(\sum_{j=1}^N\psi^2_j(r_j))}_{\rho(r_j)}\frac{1}{r_{ij}} dr_i dr_j \\ =& \frac{1}{2} \int_{r_i, r_j} \frac{\rho(r_i)\rho(r_j)}{r_{ij}} dr_i dr_j \end{align*}
Adding all three energies above, we get the total energy with Thomas-Fermi approximation (denoted as \langle H \rangle_{TF})
\langle H \rangle_{TF} = C_F\int\rho^{5/3}(r)dr + \int v(i)\rho(r)dr + \frac{1}{2} \int_{r_i, r_j} \frac{\rho(r_i)\rho(r_j)}{r_{ij}} dr_i dr_jDirac added a term for the exchange energy E_x(\rho) = -C_x \int \rho^{4/3}(r)dr in which C_x = \frac{3}{4}(\frac{3}{\pi})^{1/3}, and
\langle H \rangle_{TFD} = \langle H \rangle_{TF} + E_xThomas-Fermi(-Dirac) theory shows that the total energy can be approximated as a function of electron density \rho(r).
Density Functional Theory (DFT)
Preliminaries:
Appendix
Integral evaluation
More integrals can be found in Appendix A of [1].
For overlap integral S_{uv} = \int \phi_u(r_i)\phi_v(r_i)dr_i with Gaussian atomic orbitals \phi_u(r-R_u) = (2\alpha/\pi)^{3/4}e^{-\alpha|r-R_u|^2}, we first show that the product of two Gaussians is also a Gaussian. That is
\begin{aligned} & \phi_u(r-R_u) \phi_v(r - R_v) \\ = & e^{-\alpha|r - R_u|^2} e^{-\beta|r - R_v|^2} \\ = & e^{-(\alpha|r - R_u|^2 + \beta|r - R_v|^2)} \\ = & e^{-[\alpha(r_x - R_{u,x})^2 + \beta(r_x - R_{v,x})^2 + \alpha(r_y - R_{u,y})^2 + \beta(r_y - R_{v,y})^2 + \alpha(r_z - R_{u,z})^2 + \beta(r_z - R_{v,z})^2]} \\ = & e^{-[(\alpha + \beta)r_x^2 - 2(\alpha R_{u,x} + \beta R_{v,x}) + \alpha R_{u,x}^2 + \beta R_{u,y}^2 + \cdots]} \\ = & e^{-[(\alpha + \beta)(r_x - \frac{\alpha R_{u,x} + \beta R_{v,x}}{\alpha + \beta})^2 + \alpha R_{u,x}^2 + \beta R_{u,y}^2 - \frac{(\alpha R_{u,x} + \beta R_{v,x})^2}{\alpha + \beta} + \cdots]} \\ = & e^{-[(\alpha + \beta)(r_x - \frac{\alpha R_{u,x} + \beta R_{v,x}}{\alpha + \beta})^2 + \cdots]} e^{(\alpha + \beta)(\alpha R_{u,x} + \beta R_{v,x}) - (\alpha R_{u,x} + \beta R_{v,x})^2] / (\alpha + \beta) + \cdots} \\ = & e^{-[(\alpha + \beta)(r_x - \frac{\alpha R_{u,x} + \beta R_{v,x}}{\alpha + \beta})^2 + \cdots]} e^{-\frac{\alpha\beta}{\alpha + \beta}(R_{u,x}-R_{v,x})^2 + \cdots} \\ = & e^{-p|r - R_p|^2} \tilde{K} \\ = & \tilde{K} \phi_p(r - R_p) \end{aligned}in which p = \alpha + \beta, R_p = \frac{\alpha R_{u} + \beta R_{v}}{\alpha + \beta} is the center of the new Gaussian, and \tilde{K} = e^{-\frac{\alpha\beta}{\alpha + \beta}|R_u - R_v|^2}.
Then
S_{uv} = \int \phi_u(r_i)\phi_v(r_i)dr_i = \tilde{K} \int \phi_p(\textbf{r}_i - \textbf{R}_p) d\textbf{r}_i = \tilde{K} \int e^{-p|\textbf{r}_i - \textbf{R}_p|^2} d\textbf{r}_iLet \textbf{r} = \textbf{r}_i - \textbf{R}_p, reminds that triple integrals in spherical coordinates d\textbf{r} = (r\sin\varphi d\theta)(r d\varphi)dr = r^2\sin\varphi dr d\varphi d\theta, then
\begin{aligned} \int e^{-p|\textbf{r}_i - \textbf{R}_p|^2} d\textbf{r}_i &= \int e^{-p|\textbf{r}|^2} d\textbf{r} \\ &= \int_{r=0}^{+\infty} \int_{\varphi = 0}^\pi \int_{\theta = 0}^{2\pi} e^{-pr^2} r^2\sin\varphi dr d\varphi d\theta \\ &= 4\pi \int_{r=0}^{+\infty} e^{-pr^2} r^2 dr \end{aligned}To calculate \int_{r=0}^{+\infty} e^{-pr^2} r^2 dr, reminds that the Gaussian distribution \int_{x=-\infty}^{\infty} \frac{1}{\sigma \sqrt{2\pi}} e^{-\frac{x^2}{2\sigma^2}} = 1. Let \sigma = \sqrt{\frac{1}{2\alpha}} then we have a more general form \int_{x=-\infty}^{\infty} e^{-\alpha x^2} = \sqrt{\frac{\pi}{\alpha}}. Different each side by parameter \alpha we have \int_{x=-\infty}^{\infty} -x^2 e^{-\alpha x^2} = \frac{\sqrt{\pi}}{-2\alpha^{3/2}}, therefore \int_{x=-\infty}^{\infty} x^2 e^{-\alpha x^2} = \frac{\sqrt{\pi}}{2\alpha^{3/2}} and \int_{x=0}^{\infty} -x^2 e^{-\alpha x^2} = \frac{\sqrt{\pi}}{4\alpha^{3/2}} by symmetry. The final result of S_{uv} is
S_{uv} = 4\pi \tilde{K} \frac{\sqrt{\pi}}{4p^{3/2}} = (\frac{\pi}{p})^{3/2} e^{-\frac{\alpha\beta}{\alpha + \beta}|R_u - R_v|^2}Derivative of energy w.r.t nuclei coordinates (Geometry Optimization)
More details can be found in appendix C of [1] and [4].
Generally speaking, the total energy E_{total} (notation simplified as E in the following text) can be seen as a function of nuclei coordinates R = (R_1, R_2, \cdots) (denoted as E(R)). The coordinate-energy curve (surface) is called potential energy surface (PES). In geometry optimization, we are interested in finding the global minima of E(R), that is, R^* = \arg\min_R E(R). In order to find the minima efficiently, we are going to achieve a mission impossible – differentiate E w.r.t R, i.e., \frac{\partial E}{\partial R}.
Recall \eqref{E_total}
E = \sum_{u, v}P_{uv}H_{uv} + \frac{1}{2}\sum_{u, v, \lambda, \sigma}P_{uv}P_{\lambda\sigma}(uv|\lambda\sigma) - \frac{1}{4}\sum_{u, v, \lambda, \sigma}P_{uv}P_{\lambda\sigma}(u\lambda|\sigma v) + \frac{1}{2}\sum_{A, B}\frac{Z_A Z_B}{R_{AB}}Considering P_{uv} = 2\sum_{i=1}^{N/2} C_{ui}C_{vi} is just a function of occupied molecular orbital coefficient C, and R_{AB} = |R_A - R_B|, we get that E is a function of
- C_{ui}, which may have implicit dependence with R.
- H_{uv} = \int \phi_u(r)h(1)\phi_v(r)dr. From integral evaluation we know that the gaussian atomic orbital \phi_u(r, R_{A(u)}) = \phi_u(r-R_{A(u)}) = (2\alpha/\pi)^{3/4}e^{-\alpha|r-R_{A(u)}|^2} is a function of both r and nuclei coordinate R_{A(u)}, in which A(u) is the corresponding nuclei of the uth atomic orbital. In this case, we write H_{uv} as H_{uv} = \int \phi_u(r, R_{A(u)})h(1)\phi_v(r, R_{A(v)})dr, which has a clear dependence of nuclei coordinate R.
- (uv|\lambda\sigma) = \int \phi_u(r_1)\phi_v(r_1)\frac{1}{r_{12}}\phi_\lambda(r_2)\phi_\sigma(r_2) dr_1 dr_2. Similarly, we can write it as \int \phi_u(r_1, R_{A(u)})\phi_v(r_1, R_{A(v)})\frac{1}{r_{12}}\phi_\lambda(r_2, R_{A(\lambda)})\phi_\sigma(r_2, R_{A(\sigma)}) dr_1 dr_2, which has a clear dependence of nuclei coordinate R.
- R_A, the nuclei coordinate itself.
- Z_A, which is not correlated with nuclei coordinate R at all.
To compute \frac{\partial E}{\partial R}, we first compute the derivative of the first term, i.e., \frac{\partial E_{Hcore}}{\partial R} as an example. Remind that \frac{\partial}{\partial x} [a(x)b(x)c(x)d(x)\cdots] = \frac{\partial a(x)}{\partial x}b(x)c(x)d(x)\cdots + a(x)\frac{\partial b(x)}{\partial x}c(x)d(x)\cdots + \cdots
\begin{align*} E_{Hcore} &= \sum_{u, v}P_{uv}H_{uv} = 2\sum_{i=1}^{N/2} \sum_{u, v}C_{ui}C_{vi}H_{uv} \\ \frac{\partial E_{Hcore}}{\partial R_A} &= 2\sum_{i=1}^{N/2} \sum_{u, v}(\underbrace{\frac{\partial C_{ui}}{\partial R_A}C_{vi}H_{uv} + C_{ui}\frac{\partial C_{vi}}{\partial R_A}H_{uv}}_{\text{symmetry}} + C_{ui}C_{vi}\frac{\partial H_{uv}}{\partial R_A}) \\ &= 4\sum_{i=1}^{N/2} \sum_{u, v}\frac{\partial C_{ui}}{\partial R_A}C_{vi}H_{uv} + \sum_{u, v}P_{uv}\frac{\partial H_{uv}}{\partial R_A} \\ E_{coulomb} &= 2\sum_{i=1}^{N/2} \sum_{j=1}^{N/2} \sum_{u, v, \lambda, \sigma}C_{ui}C_{vi}C_{\lambda j}C_{\sigma j}(uv|\lambda\sigma) \\ \frac{\partial E_{coulomb}}{\partial R_A} &= 8\sum_{i=1}^{N/2} \sum_{j=1}^{N/2} \sum_{u, v, \lambda, \sigma}\frac{\partial C_{ui}}{\partial R_A}C_{vi}C_{\lambda j}C_{\sigma j}(uv|\lambda\sigma) + 2\sum_{i=1}^{N/2} \sum_{j=1}^{N/2} \sum_{u, v, \lambda, \sigma}C_{ui}C_{\lambda j}C_{\sigma j}C_{vi}\frac{\partial (uv|\lambda\sigma)}{\partial R_A} \\ &= 4\sum_{i=1}^{N/2} \sum_{u, v, \lambda, \sigma}\frac{\partial C_{ui}}{\partial R_A}C_{vi}P_{\lambda\sigma}(uv|\lambda\sigma) + \frac{1}{2}\sum_{u, v, \lambda, \sigma}P_{uv}P_{\lambda\sigma}\frac{\partial (uv|\lambda\sigma)}{\partial R_A} \\ E_{exchange} &= -\sum_{i=1}^{N/2} \sum_{j=1}^{N/2} \sum_{u, v, \lambda, \sigma}C_{ui}C_{vi}C_{\lambda j}C_{\sigma j}(u\lambda|\sigma v) \\ \frac{\partial E_{exchange}}{\partial R_A} &= -4\sum_{i=1}^{N/2} \sum_{j=1}^{N/2} \sum_{u, v, \lambda, \sigma}\frac{\partial C_{ui}}{\partial R_A}C_{vi}C_{\lambda j}C_{\sigma j}(u\lambda|\sigma v) - \sum_{i=1}^{N/2} \sum_{j=1}^{N/2} \sum_{u, v, \lambda, \sigma}C_{ui}C_{\lambda j}C_{\sigma j}C_{vi}\frac{\partial (uv|\lambda\sigma)}{\partial R_A} \\ &= -2\sum_{i=1}^{N/2} \sum_{u, v, \lambda, \sigma}\frac{\partial C_{ui}}{\partial R_A}C_{vi}P_{\lambda\sigma}(u\lambda|\sigma v) - \frac{1}{4}\sum_{u, v, \lambda, \sigma}P_{uv}P_{\lambda\sigma}\frac{\partial (u\lambda|\sigma v)}{\partial R_A} \\ E_{nuc} &= V_{NN}(R) = \frac{1}{2}\sum_{A \neq B}\frac{Z_A Z_B}{R_{AB}} = \frac{1}{2}\sum_{A \neq B}\frac{Z_A Z_B}{\sqrt{(R_{A, x} - R_{B, x})^2 + (R_{A, y} - R_{B, y})^2 + (R_{A, z} - R_{B, z})^2}}\\ \frac{\partial E_{nuc}}{\partial R_{A, x}} &= \frac{1}{2} \cdot 2 Z_A \sum_{B(\neq A)} Z_B \cdot -(R_{AB})^{-2} \cdot \frac{1}{2} (R_{AB}^2)^{1/2} \cdot 2(R_{A, x} - R_{B, x}) \quad \text{(similar for } R_{A, y} \text{ and } R_{A, z} \text{)}\\ \frac{\partial E_{nuc}}{\partial R_A} &= Z_A \sum_{B(\neq A)}\frac{Z_B (R_B - R_A)}{R_{AB}^3} \\ \frac{\partial E}{\partial R_A} &= \frac{\partial E_{Hcore}}{\partial R_A} + \frac{\partial E_{coulomb}}{\partial R_A} + \frac{\partial E_{exchange}}{\partial R_A} + \frac{\partial E_{nuc}}{\partial R_A} \\ &= 4\sum_{i=1}^{N/2} \sum_{u, v}\frac{\partial C_{ui}}{\partial R_A}C_{vi}\underbrace{(H_{uv} + P_{\lambda\sigma}(uv|\lambda\sigma) - \frac{1}{2}P_{\lambda\sigma}(u\lambda|\sigma v))}_{F_{uv}, \text{see} \eqref{G_def}} \\ &+ \sum_{u, v}P_{uv}\frac{\partial H_{uv}}{\partial R_A} + \frac{1}{2}\sum_{u, v, \lambda, \sigma}P_{uv}P_{\lambda\sigma}(\frac{\partial (uv|\lambda\sigma)}{\partial R_A} - \frac{1}{2}\frac{\partial (u\lambda|\sigma v)}{\partial R_A}) + Z_A \sum_{B(\neq A)}\frac{Z_B (R_B - R_A)}{R_{AB}^3} \end{align*}Hartree-Fock and Mean-Field
https://nukephysik101.wordpress.com/2017/07/12/hartree-fock-and-mean-field/
Density Fitting
https://github.com/psi4/psi4numpy/blob/master/Tutorials/03_Hartree-Fock/density-fitting.ipynb
References
- Szabo, A., & Ostlund, N. S. (1982). Modern quantum chemistry: introduction to advanced electronic structure theory. Dover Publication.
- Sherrill, C. D. (2000). An introduction to Hartree-Fock molecular orbital theory. School of Chemistry and Biochemistry, Georgia Institute of Technology.
- https://gitlab.com/LLindenbauer/Hartree-Fock-Tutorial
- https://py-xdh.readthedocs.io/zh_CN/latest/