I like thinking about workflow - I don’t think people talk about it enough. So in an effort to discuss and improve my own workflow, below is a description of tools I’m actively using. If you have any suggestions, please let me know!
I also do a lot of work in Python. I used to work in Spyder, mainly because it was a smooth transition from RStudio. Ultimately, though, I’ve landed on a combination of Atom and Jupyter notebooks. I like Jupyter because I can run it on a remote machine over SSH, so I can work on large datasets without crashing my own computer.
I use Git and GitHub for version control, via Git Bash.
For making things fast (enough), I typically rely on the wizardry of Numba.
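To give a sense of what that looks like, here is a minimal sketch of the kind of nested-loop code Numba tends to speed up. The function name and the toy data are made up for illustration, and the fallback decorator is only there so the snippet still runs (slowly) where Numba is not installed.

```python
import numpy as np

try:
    from numba import njit
except ImportError:
    # Assumption for illustration: fall back to plain Python if Numba is missing.
    def njit(func):
        return func


@njit
def pairwise_distance_sum(points):
    """Sum of Euclidean distances between all pairs of rows in `points`."""
    n, d = points.shape
    total = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            sq = 0.0
            for k in range(d):
                diff = points[i, k] - points[j, k]
                sq += diff * diff
            total += np.sqrt(sq)
    return total


# Hypothetical toy input: three points on a line, spaced 5 units apart.
points = np.array([[0.0, 0.0], [3.0, 4.0], [6.0, 8.0]])
result = pairwise_distance_sum(points)
print(result)
```

The first call includes compilation time, so any speed-up only shows on later calls or on larger inputs.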
I use Mendeley as a reference manager, but Zotero may be in the cards soon. I mostly write with LaTeX, so I export a BibTeX file from Mendeley, but Mendeley does most of the organizing.
My favourite way to write is using a Google Doc. I write the text in LaTeX (as if it were to appear in a LaTeX editor), but edit it as a normal Google Doc. This makes revisions easy with co-authors. Only in the final stages do I switch over to compiling in LaTeX. I use TeXworks for editing and compiling just because it’s easy, with the BibTeX file exported from Mendeley.
I like to write things down, and for that I use a Leuchtturm notebook. Or whiteboards. I love whiteboards and would prefer to have them on all walls.
For posters and presentations I use PowerPoint - I know there are alternatives, but it has been sufficient so far! Suggestions very welcome.
I organize all my projects the same way. This is to improve reproducibility, but also it just helps me find things down the road.
I have a ‘Projects’ folder that is synced with Google Drive, so everything is backed up as I work (large datasets I typically back up separately so they don’t take up too much space). Within each project, I have a consistent file structure: ‘/data’, ‘/scripts’, ‘/writing’, ‘/tables’, ‘/figures’, ‘/meetings’. Within ‘/data’, I sometimes put ‘/intermediate_data’, when an analysis takes a long time to reproduce. Within each folder I typically put a ‘/deprecated’ subfolder, as I am a bit of a hoarder. I also really try to put important information (e.g. conda environment names) in the main folder. These things are easy to forget after long pauses, which is annoying, and this helps me get back up to speed!
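The layout above could be set up with a small script along these lines. This is only a sketch: the function name `create_project`, the notes file name `PROJECT_NOTES.txt`, and the example project and environment names are all hypothetical, and the demo writes to a temporary directory rather than a real ‘Projects’ folder.

```python
import tempfile
from pathlib import Path

# Folder names taken from the layout described above.
SUBFOLDERS = ["data", "scripts", "writing", "tables", "figures", "meetings"]


def create_project(root, name, environment=None):
    """Create the standard project skeleton under `root` and return its path."""
    project = Path(root) / name
    for sub in SUBFOLDERS:
        # Every folder gets its own 'deprecated' subfolder.
        (project / sub / "deprecated").mkdir(parents=True, exist_ok=True)
    # Optional cache for results that take a long time to reproduce.
    (project / "data" / "intermediate_data").mkdir(exist_ok=True)
    if environment is not None:
        # Hypothetical notes file recording the conda environment name.
        notes = project / "PROJECT_NOTES.txt"
        if not notes.exists():
            notes.write_text(f"conda environment: {environment}\n")
    return project


# Demo in a throwaway directory; the names here are made up.
with tempfile.TemporaryDirectory() as tmp:
    project = create_project(tmp, "example_project", environment="example-env")
    created = sorted(p.name for p in project.iterdir())
    has_intermediate = (project / "data" / "intermediate_data").is_dir()
    print(created)
```

Because every `mkdir` call uses `exist_ok=True`, running it again on an existing project should leave current files alone and only fill in missing folders.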