# Posts by Tags

## Conditional execution in bash

Published:

Situation: You want to execute a script as soon as a particular file is created.

This is an easy one-liner:

    # waiter.sh
    # Usage: ./waiter.sh script_to_be_run file_to_anticipate [time_delay, def. 10 mins]
    while [ ! -f "$2" ]; do sleep "${3-10m}"; done; bash "$1"

The while loop keeps checking for the existence of the specified file: as long as the file does not exist, the loop keeps running sleep. When the file exists, the loop exits and the specified script is run. Could it get any simpler? It seems like a waste to poll continually, but I don’t see a better option for now. (If you are bored, run bash -x waiter.sh to see how many sleeps your waiter.sh has to go through before it finally runs your desired script.)

One learning point – I was using this kind of thing before to make a positional argument optional:

    [ "$3" = "" ] && time_delay="15m" || time_delay=$3

This is a “ternary” bash construction, which you can construe as a shorthand conditional: A && B || C means “if A is true, execute B; else execute C” (strictly speaking, C also runs if B fails, so it is not a true if-else). Turns out the easier way is using shell parameter expansion magic:

    time_delay=${3-15m}  # if $3 is unset, use 15m
    time_delay=${3:-15m} # if $3 is unset or null, use 15m
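To see how the two expansions differ, here is a small sketch. The function names `delay_if_unset` and `delay_if_unset_or_null` are made up for illustration, and they use `$1` rather than `$3` only to keep the calls short.

```shell
# Hypothetical helpers, for illustration only.

# ${1-15m}: the default is used only when $1 is unset.
delay_if_unset() { echo "${1-15m}"; }

# ${1:-15m}: the default is used when $1 is unset OR empty.
delay_if_unset_or_null() { echo "${1:-15m}"; }

delay_if_unset              # no argument: prints 15m
delay_if_unset ""           # empty argument: prints an empty line
delay_if_unset_or_null ""   # empty argument: prints 15m
delay_if_unset_or_null 5m   # argument given: prints 5m
```

The colon form is usually the safer choice when an empty string should be treated the same as a missing argument.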


## Extracting sequence from PDB file

Published:

There are times when you need the sequence of only the resolved amino acids in an X-ray crystal structure, not the full sequence of the construct. If you download the FASTA sequence or check the SEQRES record in the PDB file, you will find only the full sequence.

## How to persistently run your script with Bash

Published:

This is a simple script that saves me a lot of headaches. The situation: I’m running proprietary software, and there seems to be a problem with the way our license server is set up, because from time to time the process stops due to “not enough license” even though we have enough. Then we have to rerun, which is OK because there are checkpoint files, but it is a pain to constantly check whether my run has crashed or not.
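One way to approach this, as a minimal sketch rather than the script from the post: rerun the command until it succeeds. It assumes the program exits with a non-zero status when it crashes and resumes from its checkpoint files when restarted. `run_until_success` is a hypothetical name.

```shell
# Hypothetical helper: rerun a command until it exits with status 0.
# Usage: run_until_success DELAY COMMAND [ARGS...]
# Assumes a crash is signalled by a non-zero exit status.
run_until_success() {
    delay=$1
    shift
    attempt=1
    until "$@"; do
        echo "attempt $attempt failed; retrying in $delay" >&2
        attempt=$((attempt + 1))
        sleep "$delay"
    done
}
```

For example, `run_until_success 5m bash run_job.sh` would restart a placeholder `run_job.sh` every five minutes until it finishes cleanly. If the program can also fail for reasons that a rerun will not fix, a cap on the number of attempts would be a sensible addition.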

## Plotting business: Automated linear regression with Grace

Published:

What software do you use to plot?

## Formatting list of SMILES with bash scripting

Published:

Here is a fun text manipulation exercise using bash you can do in less than an hour. Given that I have this text file, file.smi:

smiles1 some_id_abc
smiles2 some_id_xyz
...


I want to have it like this:

smiles1 C00000001
smiles2 C00000002
...


This came from a real-world need to convert a list of SMILES into a format accepted by a conversion programme. Looks easy, right?
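As a rough sketch of one possible approach (using awk from the shell rather than pure bash, and not necessarily the solution the post arrives at), the second column can be replaced with a zero-padded running number. `renumber_smiles` is a hypothetical name, and the sketch assumes one record per line with no blank lines.

```shell
# Hypothetical helper: keep column 1 and replace column 2 with a running ID
# of the form C00000001, C00000002, ... (C followed by 8 zero-padded digits).
# Reads the files given as arguments, or standard input if none are given.
renumber_smiles() {
    awk '{ printf "%s C%08d\n", $1, NR }' "$@"
}
```

Usage would be along the lines of `renumber_smiles file.smi > renumbered.smi`. The ID comes from awk's line counter `NR`, which is why blank lines in the input would leave gaps in the numbering.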

## Simple web scraping with Python

Published:

The situation: I wanted to extract the chemical identifiers of a set of ~350 chemicals offered by a vendor to compare them to another list. Unfortunately, there is no catalog that neatly tabulates this information, but there is a product catalog PDF that lists the product numbers. The detailed information for each product (including the chemical identifier) can be found on the vendor’s website like this: vendor.com/product/[product_no]. Let me show you how to solve this problem with bash and Python.

## Good practice for bash scripting

Published:

I will illustrate some good practices for writing bash scripts by showing you how I write and refactor my bash script dock.sh, which does some preparation for docking and then launches docking.

## Doing t-test in batch

Published:

Situation: I want to run a t-test between two experiments, A and B. Each was run in triplicate, so we have:

data_A_1.dat
data_A_2.dat
data_A_3.dat
data_B_1.dat
data_B_2.dat
data_B_3.dat


How do I run a t-test between every combination of A and B, as well as all of A combined versus all of B combined?
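As a starting point, the pairings can be enumerated in the shell. This sketch only lists which files would be compared and leaves the t-test itself to whatever statistics tool you use; `list_pairs` is a hypothetical name.

```shell
# Hypothetical helper: print every A-vs-B replicate pair, one pair per line.
# The file names follow the data_A_N.dat / data_B_N.dat pattern above.
list_pairs() {
    for i in 1 2 3; do
        for j in 1 2 3; do
            echo "data_A_${i}.dat data_B_${j}.dat"
        done
    done
}

list_pairs | wc -l   # counts the pairs: 3 x 3 = 9
```

Each output line could then be fed to the test of your choice, for example with `list_pairs | while read -r a b; do ...; done`.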

## Picking up Python as a scientist

Published:

My PhD supervisor prophetically decreed back in 2013 that “it might do thee some good to learn Python” (not his exact words).

## Structure leads to function

Published:

Structure leads to function is a fundamental tenet of structural biology. It may sound obvious (or not), but the implications are profound.

## Job title

Published:

What science field label do you give yourself?

## Why do simulation?

Published:

“Why do we do what we do?” is a question that practitioners of any vocation should ask.

## Regarding that lab notebook… (Part 2: Is there a better way?)

Published:

Searching, searching, for a better receptacle.

## Regarding that lab notebook… (Part 1: Lab notebooking and me)

Published:

‘Tis the receptacle of our tinkering of the world, out of which we distill and crystallise knowledge.

## Automated area under the curve (AUC) calculation with Grace

Published:

I wish I had an elegant solution for this, like the one for automated linear regression, but I don’t. The solution I came up with is hacky, but it works.

## Molecular similarity network with visualised structures

Published:

Analogous to a Sequence Similarity Network (SSN), a molecular similarity network visualises the Tanimoto similarity between molecules. iwatobi shows how to construct one with molecular structures here and here using the Python packages RDKit, networkx, and cyjupyter.

## Link roundup: Oct 2019

Published:

## Link roundup: Sep 2019

Published:

## Link roundup: Aug 2019

Published:

## Link roundup: Jul 2019

Published:

## Link roundup: Jun 2019

Published:

For those in the crowd who aren’t synthetic chemists, it cannot be emphasized enough that there is no way that you could have known about either of these choices beforehand – and that if you change the reaction to another bromo-heterocyclic system, the optimal base and solvent are likely to switch again to something else entirely. It’s as bad as cell culture or X-ray crystal growing, two other areas that are famously infested with evil spirits and voodoo rituals. You run into these systems that are just intrinsically very sensitive to initial conditions, with variables that are sometimes too small or obscure for you to even realize that they’re variables.

## Link roundup: Apr 2019

Published:

## Link roundup: Mar 2019

Published:

I would make a distinction between temporary and certified assholes, because all of us under the wrong conditions can be temporary assholes. I’m talking about somebody who is consistently this way, who consistently treats other people this way. I think it’s more complicated than simply saying an asshole is someone who doesn’t care about other people. In fact, some of them really do care — they want to make you feel hurt and upset, they take pleasure in it.

… See, just get the government out of the way and everything starts to flourish! But as I’ve said many times, the FDA is not the real roadblock in this business. It’s biology. More specifically, it’s our lack of understanding of biology. Lowering standards will do nothing to help that at all.

Papers
ACS Chem Bio | Esterification Delivers a Functional Enzyme into a Human Cell
Q Rev Biophys | Frustration in Biomolecules
A 2014 paper, but I only read it recently. Besides providing a nice mental model of frustration for thinking about protein physics, the text is written in decidedly unacademic prose.
Nature | Complete biosynthesis of cannabinoids and their unnatural analogues in yeast
PLoS Comp Bio | Script of scripts: A pragmatic workflow system for daily computational research
This is like a Makefile, but more user-friendly: you can switch languages easily, and it is hosted in a Jupyter notebook. If you constantly switch between bash, Python, R, and others, this is a good way to keep your workflow in one file. I don’t think I need it at the moment.
Others
Choosing our religion

Creation is thus seen as a relationship of radical dependence. To cite an insight deriving from St Thomas Aquinas, God’s creation of the world should not be likened to a carpenter making a chest. A better analogy would be more intimate – a singer producing a song, for instance. The difference is profound. Carpenter and chest are discrete entities. Carpenters can pass on the articles they make, never seeing them again. But a song is by definition an emanation of a singer.

A crucial difference between humans and other animals is equally plain. Genuinely fulfilled human lives involve further dimensions including dignity, which is connected with the exercise of choice; and virtue, implying the need to stretch or transcend ourselves. G. K. Chesterton wrote that it makes little sense to upbraid a lion for not being properly lion-like: lions are lions. The same is not true of human beings. People everywhere have a striking idea that they ought to behave in certain “humane” ways, but also an awareness that they do not in fact behave as they should. It is often noted that these two facts are the root of all clear thinking about ourselves and our world.

Did You Know Pandas Can Do So Much?
I find these Ruben Bolling illustrations amusing: 1 | 2
GQ | The Secrets of the World’s Greatest Art Thief
What We Owe a Rabbit
Regarding the Em Dash
A history of Singapore in 10 dishes

## Link roundup: Feb 2019

Published:

Our double task is now to preserve and foster both biological evolution as Nature designed it and cultural evolution as we invented it, trying to achieve the benefits of both, and exercising a wise restraint to limit the damage when they come into conflict. With biological evolution, we should continue playing the risky game that nature taught us to play. With cultural evolution, we should use our unique gifts of language and art and science to understand each other, and finally achieve a human society that is manageable if not always peaceful, with wildlife that is endlessly creative if not always permanent.

Nature | The ten commandments for learning how to code
The Atlantic | Scientists Are Totally Rethinking Animal Cognition
Nautilus | How the Biggest Fabricator in Science Got Caught
On one hand, it is good to pre-screen data with a statistical tool, but on the other, is it that hard to generate data that appears to be experimentally derived? I can imagine, instead of coming up with the numbers oneself, one could sample from a random number generator with specified distribution. How would one detect this sort of fraud?
Nautilus | Why Misinformation Is About Who You Trust, Not What You Think
NY Times | Why Do South Asians Have Such High Rates of Heart Disease?

Studies show that at a normal body weight — generally considered a body mass index, or B.M.I., below 25 — people of any Asian ancestry, including those who are Chinese, Filipino and Japanese, have a greater likelihood of carrying this dangerous type of fat.

Eat Meat. Not Too Much. Mostly Monogastrics.

…a diet including chicken and pork, but no dairy or beef, has lower greenhouse gas emissions than a vegetarian diet that includes milk and cheese, and almost gets within spitting distance of a vegan diet.

All LSTABs face a dilemma. Politicians generally prefer direct answers to their questions. In other words, they want policy recommendations. They have been known to ask for ‘one-handed scientists’, so that they don’t have to hear ‘on the other hand’.

Borgwardt says the Korean study shows that tea has “a relatively strong effect”, on a par with that of 2.5 hours of exercise per week. Epidemiological studies suggest that long-term habitual consumption of green tea might reduce the risk of dementia. One study of people aged over 55 in Singapore, for example, found that those who drank as little as one cup of tea per week performed better at memory and information-processing tasks than did non-tea-drinkers.

However, nearly two decades after the first predictions of dramatic success, we find no impact of the human genome project on the population’s life expectancy or any other public health measure, notwithstanding the vast resources that have been directed at genomics. Exaggerated expectations of how large an impact on disease would be found for genes have been paralleled by unrealistic timelines for success, yet the promotion of precision medicine continues unabated

“Twelve years until we all die” is catchier than “under some reasonable but debatable assumptions about economic growth, policy choices, and the physical climate sensitivity, the carbon budget to stay below the arbitrary threshold of a 1.5 degree C temperature increase relative to pre-industrial conditions appears to be exceeded by 2030”.

## Link roundup: Jan 2019

Published:

Science
Nature | Can quantum ideas explain chemistry’s greatest icon?
Quizzes from Harvard’s The Music Lab
If you have always suspected that you are tone deaf, have I got a quiz for you.
Nature | Extreme chemistry: experiments at the edge of the periodic table

This is somewhat as expected: the strength of chemical bonding tends to decrease down a periodic group, as atoms get larger. But to fully explain superheavies’ chemistry, Pershina’s calculations must also take into account relativistic effects. In very heavy atoms, which have super-strong interactions between the innermost electrons and the highly charged nuclei, the electrons are travelling so fast (potentially at more than 80% of the speed of light) that their mass increases, as special relativity predicts. This pulls them farther in towards the nucleus, which can mean that they screen the outer electrons from the nuclear charge more effectively. That alters the outer electrons’ energies and, consequently, their chemical reactivity.

Nature | Flying squirrels are secretly pink
Harvard Business Review | Time for Happiness
In the Pipeline | Exercise And Its Signaling
In the Pipeline | Quinine’s Target
Quinine’s target has been identified. This is a significant step in malaria research.
BBC | A bit of meat, a lot of veg - the flexitarian diet to feed 10bn
The Smithsonian | The Statistician Who Debunked Sexist Myths About Skull Size and Intelligence
PopSci | Saturn is ancient, but its rings are only as old as the dinosaurs
Nature | Watch: Robot reveals how ancient reptile ancestor moved
It’s cool that they even made an interactive demo.
Nature | Cryptic DNA sequences may help cells survive starvation
In the Pipeline | Nivien’s Shot
Drug research is hard.
In the Pipeline | Come One, Come All to These Kinases.
Is Sunscreen the New Margarine?
Nature | Designer protein delivers signal of choice
Layman summary of a new paper.
The Conversation | The periodic table is 150 – but it could have looked very different
Science | Four lessons about transitioning from academia to the ‘real world’
Brain Pickings | Love After Life: Nobel-Winning Physicist Richard Feynman’s Extraordinary Letter to His Departed Wife
The Guardian | Why exercise alone won’t save us
The Scientist | Can Viruses in the Genome Cause Disease?
The Atlantic | Why Exaggeration Jokes Work
Wired | How a Reclusive Lizard Became a Prize Find for Wildlife Smugglers
3 Quarks Daily | The vast and mysterious real numbers
Papers
PLOS Comp Bio | Inherent versus induced protein flexibility: Comparisons within and between apo and holo structures
Science | An enantioconvergent halogenophilic nucleophilic substitution (SN2X) reaction
Science | Porphyromonas gingivalis in Alzheimer’s disease brains: Evidence for disease causation and treatment with small-molecule inhibitors
PLOS Comp Biol | Ten simple rules on how to create open access and reproducible molecular simulations of biological systems
ACS Catalysis | Structure-Guided Triple-Code Saturation Mutagenesis: Efficient Tuning of the Stereoselectivity of an Epoxide Hydrolase
Cell Chem Bio | What Makes a Kinase Promiscuous for Inhibitors?
Nature | Enzymatic assembly of carbon–carbon bonds via iron-catalysed sp3 C–H functionalization
It’s nice to see that even though Frances Arnold just received the Nobel Prize, she is still actively publishing papers. Or maybe I notice just because I am on the lookout for publications on protein design ¯\_(ツ)_/¯
PNAS | Dietary sugar silences a colonization factor in a mammalian gut symbiont [via Scientific American]
Others
What Was It About Animorphs?
I never finished reading Animorphs when I was a teenager because the library didn’t have the complete collection. I reread everything last year and am glad that I did.
Language Log | Slavs and slaves
NYT | Virginia Woolf? Snob! Richard Wright? Sexist! Dostoyevsky? Anti-Semite!

I think we’d all be better readers if we realized that it isn’t the writer who’s the time traveler. It’s the reader. When we pick up an old novel, we’re not bringing the novelist into our world and deciding whether he or she is enlightened enough to belong here; we’re journeying into the novelist’s world and taking a look around.

As Tintin turns 90, the comic book hero is still teaching children about the world
I have an anecdote similar to the author’s about how informative comic books can be, though mine was with Doraemon, of all things.
Language Log | Sinographs for “tea”
Fascinating. Adding The True History of Tea to my to-read list.
Buzzfeed News | How Millennials Became The Burnout Generation
The Next Level of Data Visualization in Python

## Science in the age of machine learning

Published:

This C&EN article <Is machine learning overhyped?> is great in its candour and depth about how machine learning is affecting chemistry. As chemistry is the central science, some points also definitely apply to science at large.

## Book notes: On not speaking Chinese

Published:

This book is quite different from my usual reading, which is mostly fiction, science fiction, and popular science. It is categorically a humanities book; in fact, some chapters were previously published in humanities journals. Nevertheless, I persevered, since the subject is close to my heart (and my identity). Also, Ien Ang’s writing is not boring academic prose. It is certainly not easy reading, but it is very readable, and her personal stake in the subject helps. The book is an academic, yet also personal, essay.

## Swedish loanwords and pronunciation woes

Published:

I’m reading Oliver Sacks’ Uncle Tungsten and was delighted to find out that tungsten is a Swedish loanword (lit. heavy stone). That got me to revisit my list of Swedish loanwords in English, as well as some names that have become scientific terms.

## Tomas Tranströmer and many-worlds interpretation

Published:

Finding science in literature

## Revisiting Gödel

Published:

For years after I first encountered Gödel’s incompleteness theorems, they utterly baffled me (just look at the Wikipedia page and tell me if it is comprehensible). So I dismissed them as one of those inexplicable-to-non-mathematicians things.

## Rainbow connection

Published:

My most vivid memory of seeing a rainbow is from a tour bus in Iceland. It was faint, like a watercolour painting on the canvas that was the pale blue sky and the treeless Icelandic landscape. I noticed it first, and a few minutes later the whole bus was teetering to one side, everyone peering spellbound out of the window (except for the bus driver, I hope).

## Automated form-filling with Python

Published:

Situation: You need to submit a lot of entries to a website, but it accepts only individual submissions, not batches.

Published:

## The Two Cultures | CP Snow

The stuff Snow was talking about in this lecture feels antiquated now. Probably it was the peculiar time and place that polarised the “two cultures”. Or perhaps it was Snow’s caricature. (Snow himself looked back at his 1959 lecture in Part II.)

Nevertheless, we can always learn something – some of the points are interesting and applicable. The divide of the current age, though, is between scientists and non-scientists – hopefully it is something we will overcome. The introduction by Stefan Collini is also a commentary worth reading.

## Muscle memory

Published:

I have been teaching swimming to a friend who is a relative beginner. I often find myself having no words to describe a certain motion or posture and saying “you just have to experience it with your body”. While I do attribute this partly to my lack of training as a swimming instructor, I still think there are things that cannot be conveyed in words, especially when it comes to motor skills, don’t you think?

Martial arts: definitely a motor skill
Image credit: SMBC Comics

Anyway, thinking about this, my mind wandered to the talk about failures in science. It’s like this – before I did my PhD I had of course heard about the inevitable stress and failures (and I find myself saying the same things to prospective and current PhD students too), but knowing it cognitively is different from experiencing it oneself, isn’t it?

Here is a recent Nature article on the subject: <Scientific progress is built on failure>. Reading this, on one hand I hope that a PhD student or a prospective one takes note and does the necessary mental preparation; but on the other hand, I cannot help but go “My son, I know these things would fall on deaf ears. You just have to experience it for yourself”.
