{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# QMMD Usage Demonstration\n",
    "\n",
    "This notebook demonstrates how to use `QMMD` to automate quantum mechanical calculations and molecular dynamics simulations.\n",
    "\n",
    "The directory and file paths in this notebook assume that it is run from the `docs/` directory of the repository. If it is run from elsewhere, change the paths accordingly so that the cells execute properly."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {},
   "outputs": [],
   "source": [
    "%load_ext autoreload\n",
    "%autoreload 2"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Quantum Mechanical Calculations\n",
    "`QMMD` can be used to automatically:\n",
    "* group individual `xyz` coordinate files into separate directories,\n",
    "* generate job scripts for the `Gaussian` software commonly used for quantum mechanical calculations of small molecules,\n",
    "* generate job scripts for a given high-performance computing (HPC) system, such as Gadi at the National Computational Infrastructure (NCI) in Australia,\n",
    "* submit the generated HPC job scripts to the HPC scheduler,\n",
    "* tabulate quantities of interest from Gaussian output files into an Excel document with different sheets."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Generation of Gaussian job scripts and HPC submission scripts\n",
    "Suppose we have a directory of `xyz` files on which we want to run quantum mechanical calculations. The directory can hold anywhere from a single file to as many as your file system allows. Importantly, it should contain only the `xyz` files that you want to run Gaussian jobs on.\n",
    "\n",
    "Let's create such a directory, populated with example `xyz` files:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from qmmd.datasets import genExampleXYZs\n",
    "\n",
    "# Directory that will hold the example xyz files\n",
    "inpDirPath = './exampleXYZs'\n",
    "genExampleXYZs(inpDirPath)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "exampleXYZs:\n",
      "\u001b[0m\u001b[01;32mexample1.xyz\u001b[0m  \u001b[01;32mexample2.xyz\u001b[0m  \u001b[01;32mexample3.xyz\u001b[0m\n"
     ]
    }
   ],
   "source": [
    "!ls -R --color exampleXYZs"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We can then generate the scripts for these files by calling the function `genAllScripts()`. Here the `verbose` argument is turned on to show what happens under the hood:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "metadata": {
    "collapsed": true,
    "jupyter": {
     "outputs_hidden": true
    },
    "tags": []
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "Generating all job scripts for molecules under directories under ./exampleXYZs...\n",
      "\n",
      "Grouping molecules in ./exampleXYZs into individual directory...\n",
      "  Making directories for molecules in ./exampleXYZs...\n",
      "    Making directory for example1.xyz...\n",
      "      Made directory for example1.xyz!\n",
      "    Making directory for example2.xyz...\n",
      "      Made directory for example2.xyz!\n",
      "    Making directory for example3.xyz...\n",
      "      Made directory for example3.xyz!\n",
      "  DONE -- Made all directories!\n",
      "\n",
      "  Moving molecules in ./exampleXYZs into individual directory...\n",
      "    Processing example1.xyz...\n",
      "      Moved example1.xyz to example1!\n",
      "    Processing example2.xyz...\n",
      "      Moved example2.xyz to example2!\n",
      "    Processing example3.xyz...\n",
      "      Moved example3.xyz to example3!\n",
      "  DONE -- Moved all files!\n",
      "\n",
      "DONE -- Grouped all molecules!\n",
      "\n",
      "  Processing example1...\n",
      "    Generated Gaussian input file for example1!\n",
      "    Generated HPC job script for example1!\n",
      "  Processing example2...\n",
      "    Generated Gaussian input file for example2!\n",
      "    Generated HPC job script for example2!\n",
      "  Processing example3...\n",
      "    Generated Gaussian input file for example3!\n",
      "    Generated HPC job script for example3!\n",
      "DONE -- Generated all scripts!\n",
      "\n"
     ]
    }
   ],
   "source": [
    "from qmmd import genAllScripts\n",
    "\n",
    "genAllScripts(inpDirPath, verbose=True)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Listing `inpDirPath` recursively confirms that the scripts have been generated:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 20,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "exampleXYZs:\n",
      "\u001b[0m\u001b[34;42mexample1\u001b[0m  \u001b[34;42mexample2\u001b[0m  \u001b[34;42mexample3\u001b[0m\n",
      "\n",
      "exampleXYZs/example1:\n",
      "\u001b[01;32mexample1.inp\u001b[0m  \u001b[01;32mexample1.sh\u001b[0m  \u001b[01;32mexample1.xyz\u001b[0m\n",
      "\n",
      "exampleXYZs/example2:\n",
      "\u001b[01;32mexample2.inp\u001b[0m  \u001b[01;32mexample2.sh\u001b[0m  \u001b[01;32mexample2.xyz\u001b[0m\n",
      "\n",
      "exampleXYZs/example3:\n",
      "\u001b[01;32mexample3.inp\u001b[0m  \u001b[01;32mexample3.sh\u001b[0m  \u001b[01;32mexample3.xyz\u001b[0m\n"
     ]
    }
   ],
   "source": [
    "!ls -R --color exampleXYZs"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "`genAllScripts()` takes many parameters, most of which have default values that are used when you don't specify them. Note that if you provide the `keywordLine` argument, some of the other arguments (such as `solvent` and `solventModel`) are overridden, because the keyword line already specifies them."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {
    "collapsed": true,
    "jupyter": {
     "outputs_hidden": true
    },
    "tags": []
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Help on function genAllScripts in module qmmd.qmcalc.genScripts:\n",
      "\n",
      "genAllScripts(inpDirPath, keywordLine=None, method='m062x', basisSet='6-311+g(d,p)', solvent='water', solventModel='cpcm', mem=4000, ncpus=8, calcType='GOVF', charge=0, spin=1, scheduler='pbs', cluster='gadi', walltime='24:00:00', vmem=8000, jobfs=9000, project='p39', software='g16', version='c01', verbose=False)\n",
      "    Generate Gaussian input job files and submission files for molecules under all directories under a specified directory ('inpDirPath').\n",
      "    \n",
      "    Parameters\n",
      "    ----------\n",
      "    inpDirPath : str\n",
      "        Directory path to the input directories.\n",
      "    keywordLine : Union[str,None]\n",
      "        The line of keywords specification for Gaussian job, the other input arguments will be used to compose the line if it is not provided.\n",
      "    method : str\n",
      "        Keyword for DFT method specification in Gaussian.\n",
      "    basisSet : str\n",
      "        Keyword for basis set specification in Gaussian.\n",
      "    solvent : str\n",
      "        Keyword for solvent specification in Gaussian.\n",
      "    solventModel : str\n",
      "        Keyword for SCRF method specification in Gaussian.\n",
      "    mem : Union[int,str]\n",
      "        Amount of memory to request for the Gaussian job.\n",
      "    ncpus : Union[int,str]\n",
      "        Number of CPUs to request for the job.\n",
      "    calcType : str\n",
      "        Type of calculation (e.g. 'GOVF' for normal geometry optimisation; 'TSGOVF' for transition state geometry optimisation, \n",
      "        'SPEiS' for single point energy calculation, refer to 'keywordDict' for other options).\n",
      "    charge : int\n",
      "        Charge of the molecule (pay special attention if you have a transition state).\n",
      "    spin : int\n",
      "        Spin of the molecule.\n",
      "    scheduler : str\n",
      "        Scheduler to submit the job to.\n",
      "    cluster : {'gadi', 'uq-rcc'}\n",
      "        Cluster to run the job on.\n",
      "    walltime : str\n",
      "        Wall time to request for the job.\n",
      "    vmem : Union[int,str]\n",
      "        Amount of memory to request for the HPC job.\n",
      "    jobfs : Union[int,str]\n",
      "        Amount of Jobfs memory to request for the job.\n",
      "    software : str\n",
      "        Gaussian software name to use for the job.\n",
      "    version : str\n",
      "        Version of the software.\n",
      "    verbose : bool\n",
      "        Whether to display details of the process.\n",
      "    \n",
      "    Notes\n",
      "    -----\n",
      "    - Users should organise their directories such that a directory is created for each molecule to be calculated, and all of these directories should be placed under the specified directory that this function takes in ('inpDirPath')\n",
      "\n"
     ]
    }
   ],
   "source": [
    "help(genAllScripts)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Below is an example of specifying the keywords through the `keywordLine` argument:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {
    "collapsed": true,
    "jupyter": {
     "outputs_hidden": true
    },
    "tags": []
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "Generating all job scripts for molecules under directories under ./exampleXYZs...\n",
      "  Processing example1...\n",
      "    Generated Gaussian input file for example1!\n",
      "    Generated HPC job script for example1!\n",
      "  Processing example2...\n",
      "    Generated Gaussian input file for example2!\n",
      "    Generated HPC job script for example2!\n",
      "  Processing example3...\n",
      "    Generated Gaussian input file for example3!\n",
      "    Generated HPC job script for example3!\n",
      "DONE -- Generated all scripts!\n",
      "\n"
     ]
    }
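   ],
   "source": [
    "# Full Gaussian keyword line; when provided, it is used instead of composing one from the other arguments\n",
    "keywordLine = '# m062x/6-311+g(d,p) opt=calcfc freq scrf=(cpcm,solvent=water) int(grid=ultrafine)'\n",
    "genAllScripts(inpDirPath, keywordLine=keywordLine, verbose=True)"
   ]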
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Molecular Dynamics Simulations\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.9.16"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}