Project Development & Roadmap
Initial Motivation
This project was created to add needed features to Python’s built-in round() function.
Namely:
the ability to consistently round floating point numbers despite float’s inherent lack of precision.
the ability to round a number by number of significant figures/digits instead of by decimal places only.
the ability to round a number by its associated uncertainty.
Additional features were also needed to format numeric values for scientific publications. Namely:
displaying a number with its associated uncertainty - in bracketed form - with a space between every thousand(th)s
Next Steps
The Ultimate goal of the project is to add the included rounding features to the Python standard library. This will require some refactoring among other tasks before the first pull request should be made. This work is outlined in the following sections:
Separate out formatting code
The code for formatting the resultant rounded number string does not belong in the standard library’s round() function but would make more sense as either it’s own package, as part of the numpy package (ie. the format_float_positional() function), or as part of another package involving numeric or data visualization.
This will have the added benefit of making sigfig’s code more readable which is never a bad thing.
Increase numeric storage efficiency and standardization
sigfig currently parses numbers by first converting to string and then storing in a {<ten’s power>:<numeric value>} dict (see _Number in source for technical details). While this guarantees bug-free functionality for all numbers and is suitable for numbers already stored as strings, this lacks efficiency for decimal.Decimal and float. Numeric values could possibly be stored instead using the same storage technique employed by the decimal.Decimal package (after an investigation of that technique to ensure full code coverage). This should fully satisfy the decimal.Decimal case whereas the float case can be handled as-is by default but allowed to optionally (with high_speed=True instead of the default high_accuracy=True) use floating point arithmetic when speed trumps accuracy.
Interface Overhaul
The current interface is multi-dimensional and very forgiving which allows for a wide range of allowable but unexpected behaviours instead of warning/crashing if a user strays from typical use cases. While this suit’s the project’s current scrappy state, a redesign of the interface-handling _arguments_parse() function is recommended before merging with the standard library.
Possible Features
Other features looking for implementation by any potential contributors are welcomed, would be greatly appreciated, are detailed below, and (subjectively) ordered by priority:
Baking and User-Defined Formats/Styles
The ability to “bake” default behaviour into round() (essentially partial application, like you might do with functools.partial()) could allow for many desirable customizations. Some examples are:
rounding by number of decimals instead of significant figures by default through something like
round.bake(round_by_decimals=True)always spacing numbers by 3 (in the case where output is of type
str) throughround.bake(spacing=3, spacer=' ')
Warnings, Alerts, and Feedback
Certain actions and usages of round() warrant feedback given to the user. These include (but are not limited to) the following:
warning for invalid keyword arguments
warning for depreciated usages
warning when out of range values are passed
informing when conflicting inputs are provided
informing when any data is passed implicitly instead of explicitly. For example:
round(3.2, 1)versusround(3.2, sigfigs=1)
Units, Formatted Numbers, and Unit Prefixes
Modification of the _num_parse() function can be made without much effort to allow for formatted numbers (ie. '1,237.0'), currency (ie. '$3,157.00'), or numeric data with units (ie. '3475.2753nm') to be accepted. This formatting data can be parsed and interpreted alongside the numeric data and the resulting output from the round() operation can be given (by default) in the same format as the input was given.
Also, common units with their prefixes can be parsed so that more suitable prefixes for units can be chosen or explicitly specified by a new keyword argument. For example:
>>> round('3475.2753nm', '45.9479nm')
'3.48 ± 0.05 μm'
>>> round('3475.2753nm', '45.9479nm', units='cm', sep='brackets')
'0.000348(5) cm'
Documentation: Figure(s) for Rounding Rules
The Uncertainty Rounding Rules section may be confusing to those unfamiliar with the concept and would benefit from visual aid. This can help to disambiguate like-sounding terms like “uncertainty”, “magnitude of uncertainty”, “number’s uncertainty”, and “error” as well as “number”, “given number”, and “number of decimals”.
Input Precision
Input precision is not currently stored. In cases where a number is rounded to more decimals than it was given (ie. round(1.23, 0.000073)) a warning can be thrown stating “implicit uncertainty (0.005) greater than provided uncertainty (0.000073). Provided uncertainty will be used.” since (in this case) the value 1.23 could be representing any value between 1.225 and 1.2349999….
Formatting of Exponentials
The exponentials resulting from scientific and engineering notation are separated from the number & uncertainty with an uppercase “E” present in both the number and resulting uncertainty. Some might find it useful to customize the character(s) and/or optionally only appended the character after the uncertainty and not after the number.
Parse Number Last (small efficiency increased)
A small gain to efficiency can be made by first parsing the uncertainty, number of decimals, or number of significant figures (aka the rounder) since these dictate how many digits are relevant in the given number. With the rounder known, the parsing of the given number can be quicker since digits beyond what the rounder dictates can be discarded. This will require a re-design of _num_parse() where the exponential information is parsed first and will only be of (limited) benefit when the number is given with exponential notation (unless it’s known to not have a trailing exponent).
Contribution Guide
Getting Started
Clone the repo and configure your environment:
$ git clone https://github.com/drakegroup/sigfig.git
$ cd sigfig
$ curl -sSL https://install.python-poetry.org | python3 -
Fork the repository on GitHub.
Clone your fork locally.
Create a new branch for your changes.
Make your changes and commit them to your branch.
Push your changes to your fork on GitHub.
Open a pull request to the master branch of the main repository.
Testing
All new features should be accompanied by tests. To run the tests and view the coverage report:
$ coverage run test/test.py
$ coverage report -m sigfig/sigfig.py
Style
sigfig was developed with a few PEP 20 idioms in mind:
Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Readability counts.
Refer to PEP 8 and the Google Python Style Guide where possible… but no need to be dogmatic about it.
Documentation
To build the documentation locally:
$ cd doc
$ make html
Then open the file doc/_build/html/index.html in your browser.