-
Notifications
You must be signed in to change notification settings - Fork 2
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Functionality to convert between R and Python #10
Conversation
- Modify `rspec_to_pyspec` to support `spectraVariableMapping`. - Add `spectraVariableMapping`. - Add unit tests for `rspec_to_pyspec`. - Configure with `basilisk` package.
This is related to issue #8 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Parameterized tests would maybe make the testing cleaner, also I'm unsure about the reference
parameter - if it is really not needed it shouldn't be there, if it has to be there (even without being used) it should maybe be stated why.
DESCRIPTION
Outdated
Roxygen: list(markdown=TRUE) | ||
Encoding: UTF-8 | ||
StagedInstall: no | ||
SystemRequirements: python (>= 2.7) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd stick with python larger 3.5 or so - not sure how well things work in python2
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, I'll update to 3.5 (or even newer?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think 3.5 or 3.7 are reasonable choices
#' @importFrom basilisk BasiliskEnvironment | ||
matchms_env <- basilisk::BasiliskEnvironment( | ||
envname = "matchms_env", pkgname = "SpectriPy", | ||
packages = c("matchms==0.14.0"), | ||
channels = c("bioconda", "conda-forge") | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is pretty nice!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yep - the basilisk
people really did a great work to ensure reproducibility. Will make it much easier to the end user :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually - is matchms
version 0.14.0 OK or do you suggest another, more recent, version?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think there is a more recent one at the moment.
R/conversion.R
Outdated
rspec_to_pyspec <- function(x, mapping = spectraVariableMapping(), | ||
reference = import("matchms"), | ||
BPPARAM = SerialParam(), .check = TRUE) { | ||
if (.check && !is(x, "Spectra")) | ||
stop("'x' should be a Spectra object.") | ||
plist <- spectrapply(x, .single_rspec_to_pyspec, spectraVariables = mapping, | ||
BPPARAM = BPPARAM) | ||
r_to_py(unname(plist)) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The reference
parameter is actually not used
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
True, reference
is not used here, but in .single_rspec_to_pyspec
. It contains the reference to the Python Spectrum object, needed for construction. The question is if we can set a global variable that contains this reference during package loading?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
At first I also wanted to drop it - but then I realized:
- this allows also to use
import("matchms", convert = FALSE)
to disable automatic conversion of python to R and vice versa, maybe not wrong to disable to speed up things? - Also, in a loop, you can define this
reference
once, and then in the loop re-use it. Speeds up things.
I would thus keep it (it's anyway for the advanced user).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think having a function that populates a global variable might be the best way?
I don't think that passing the reference to the imported matchms is a good thing - this implies you could use different versions of matchms in a single session etc., which is not very clean.
But we can figure out how to do this properly in another issue. See #13
R/conversion.R
Outdated
pyspec_to_rspec <- function(x, mapping = spectraVariableMapping(), | ||
reference = import("matchms"), | ||
BPPARAM = SerialParam(), .check = TRUE) { | ||
if (!is(x, "python.builtin.list")) | ||
stop("'x' is expected to be a Python list.") | ||
x <- py_to_r(x) | ||
if (.check && !all(vapply(x, function(z) | ||
is(z, "matchms.Spectrum.Spectrum"), logical(1)))) | ||
stop("'x' is expected to be a Python list of matchms Spectrum objects.") | ||
spectra_list <- bplapply(x, .single_pyspec_to_rspec, | ||
spectraVariables = mapping, BPPARAM = BPPARAM) | ||
do.call(concatenateSpectra, spectra_list) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here, reference
parameter is not used -> does import("matchms")
by itself have any implications (initialize global variables) if it is called?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now that you say it - I think I forgot to pass this reference
down to the .single_pyspec_to_rspec
function! I will fix that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We just need to reference in reference
to the correct environment that was set up by basilisk
, I guess the name needs to be changed to matchms_env
then, right? so reference = import("matchms_env")
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Forget what I said... stupid stuff. The enviroment is loaded at a different point. It's the import of the package...
R/conversion.R
Outdated
.single_pyspec_to_rspec <- function(x, | ||
spectraVariables = spectraVariableMapping(), | ||
reference = import("matchms")) { | ||
plist <- x$metadata | ||
vars <- spectraVariables[spectraVariables %in% names(plist)] | ||
if (length(vars)) { | ||
rlist <- lapply(vars, function(z) plist[z]) | ||
## Drop NULL variables. | ||
spd <- DataFrame(rlist[lengths(rlist) > 0]) | ||
if (!nrow(spd)) | ||
spd <- DataFrame(msLevel = NA_integer_) | ||
} else | ||
spd <- DataFrame(msLevel = NA_integer_) | ||
spd$mz <- NumericList(as.numeric(x$peaks$mz), compress = FALSE) | ||
spd$intensity <- NumericList(as.numeric(x$peaks$intensities), | ||
compress = FALSE) | ||
Spectra(spd) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here - reference
is not reference explicitly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great catch! I will drop it here. Think this must have a copy-paste error. Or is there any reason this should be here @michaelwitting ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I removed it - did not break any unit tests, so should be fine.
tests/testthat/test_conversion.R
Outdated
p <- SpectriPy:::.single_rspec_to_pyspec(sps[1L]) | ||
res <- SpectriPy:::.single_pyspec_to_rspec(p, vars) | ||
expect_equal(mz(res), mz(sps[1L])) | ||
expect_equal(intensity(res), intensity(sps[1L])) | ||
expect_equal(rtime(res), rtime(sps[1L])) | ||
expect_equal(msLevel(res), msLevel(sps[1L])) | ||
|
||
p <- SpectriPy:::.single_rspec_to_pyspec(sps[2L]) | ||
res <- SpectriPy:::.single_pyspec_to_rspec(p, vars) | ||
expect_equal(mz(res), mz(sps[2L])) | ||
expect_equal(intensity(res), intensity(sps[2L])) | ||
expect_equal(rtime(res), rtime(sps[2L])) | ||
expect_equal(msLevel(res), msLevel(sps[2L])) | ||
|
||
p <- SpectriPy:::.single_rspec_to_pyspec(sps[3L]) | ||
res <- SpectriPy:::.single_pyspec_to_rspec(p, vars) | ||
expect_equal(mz(res), mz(sps[3L])) | ||
expect_equal(intensity(res), intensity(sps[3L])) | ||
expect_equal(rtime(res), rtime(sps[3L])) | ||
expect_equal(msLevel(res), msLevel(sps[3L])) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This could be solved nicely using the parameterized test cases from the patrick
R package I think.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
never used the patrick
package thus far - I'll have a look at it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done. Thanks for the suggestion!!!
I've addressed all your points @hechth - looking forward to your review. |
This should also close #10, right? |
Thanks for the review @hechth ! I will also try to set up proper github action tests in the main branch. |
Do you have experience with the CI for R packages? And do we want to integrate the package into conda? |
Yes, for our R-packages we use github actions that check the package on Windows, macOS and Linux. Integration into conda? I have no idea, never done that :) if you think that might be good we should do it, but how about all the dependencies then? are they already in conda? |
I can check the state of the dependencies for conda. |
This PR based on the work of @michaelwitting and:
basilisk
package to ensure the correct Python environment is available to the package (will be automatically installed and bundled within the package - i.e. is independent on any system Python packages).