Issues with absexp distance prior and step prior #797

Open
christinawlindberg opened this issue Feb 7, 2024 · 10 comments

@christinawlindberg
Contributor

I noticed a couple of issues when trying to use the new priors in the BEAST:

1. Absexponential Prior

When using the absolute exponential (absexponential) prior for distances, the new weights were not being applied to the existing weight values in the astropy table. As a result, rather than following an exponential distribution, the priors remained flat.

I think this is because we were only updating the weights for a subset of indexes in the table, rather than updating the entire column at once. One way to fix that is to fill temporary arrays with the per-distance factors and then multiply them into the full columns at the end.

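# replacement for lines 100-109 in grid_and_prior_weights.py
# Collect the per-row multipliers in temporary arrays first, then apply
# each one to its whole column in a single operation after the loop.
temp_grid_weight = np.zeros_like(_tgrid["grid_weight"])
temp_prior_weight = np.zeros_like(_tgrid["prior_weight"])
temp_weight = np.zeros_like(_tgrid["weight"])
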
for i, dist_val in enumerate(uniq_dists):
    # get the grid for this distance
    (dindxs,) = np.where(_tgrid["distance"] == dist_val)
    temp_grid_weight[dindxs] = (
        dist_grid_weights[i] * total_dist_grid_weight[i]
    )
    temp_prior_weight[dindxs] = (
        dist_prior_weights[i] * total_dist_prior_weight[i]
    )
    temp_weight[dindxs] = dist_weights[i] * total_dist_weight[i]

_tgrid["grid_weight"] *= temp_grid_weight
_tgrid["prior_weight"] *= temp_prior_weight
_tgrid["weight"] *= temp_weight
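
A possible explanation for why an in-place update can be silently lost, offered as an assumption rather than a confirmed diagnosis: if the original loop selected rows with an index array before selecting the column (something like `_tgrid[dindxs]["weight"] *= ...`), the row selection produces a copy, and the multiplication never reaches the table. The sketch below shows the effect on a plain NumPy structured array. The array `grid` and its contents are made up for illustration, and astropy tables were not tested here.

```python
import numpy as np

# Toy stand-in for the model grid (hypothetical data, not the BEAST grid).
grid = np.zeros(4, dtype=[("distance", "f8"), ("weight", "f8")])
grid["distance"] = [1.0, 1.0, 2.0, 2.0]
grid["weight"] = 1.0

(dindxs,) = np.where(grid["distance"] == 2.0)

# Rows first, then column: indexing with an integer array returns a copy,
# so this multiplies the copy and leaves `grid` untouched.
grid[dindxs]["weight"] *= 5.0
lost_update = grid["weight"].copy()

# Column first, then rows: the column is a view into `grid`,
# so the update is written back.
grid["weight"][dindxs] *= 5.0
kept_update = grid["weight"].copy()

print(lost_update)
print(kept_update)
```

Building the full-length temporary arrays and multiplying whole columns, as in the replacement above, avoids the question of copy versus view altogether.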

2. Step Prior

When I try to use the step prior for Av, the sampled distances are restricted to the range in front of the cloud, i.e. closer than dist0 (e.g. see plot 1). I think this is due to how the weights for the Av prior are calculated for each lognormal.

image

For sources in front of the cloud, we can define the Av distribution as a lognormal centered on 0.1 with a sigma of 0.05. For sources behind the cloud, we can instead define the Av distribution as a lognormal centered on 3.1 with a sigma of 0.05.

Each lognormal is normalized so that its total probability is equal to one. This means that the lognormal in front of the cloud has a much higher probability density right around 0.1 than the lognormal behind the cloud has around 3.1 (e.g. see plot 2).

As a result of this imbalance in the Av priors, all sources sampled for catalogs come from in front of the cloud, limiting the distance range to only the front.
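
The imbalance described above can be checked with a few lines of NumPy. This is a minimal sketch, not the BEAST implementation: `lognormal_pdf` is a hypothetical helper, and it assumes that "centered on" means the median and that sigma is a width in natural-log units, which may differ from how BEAST defines `lgsigma`.

```python
import numpy as np


def lognormal_pdf(x, center, sigma):
    """Lognormal density with median `center` and natural-log width `sigma`.

    Hypothetical helper for illustration; the parameterization is assumed.
    """
    x = np.asarray(x, dtype=float)
    norm = x * sigma * np.sqrt(2.0 * np.pi)
    return np.exp(-0.5 * ((np.log(x) - np.log(center)) / sigma) ** 2) / norm


sigma = 0.05
front_peak = lognormal_pdf(0.1, 0.1, sigma)  # density at Av = 0.1, in front
back_peak = lognormal_pdf(3.1, 3.1, sigma)   # density at Av = 3.1, behind

# Both curves integrate to one, so the narrower one (in linear Av) is taller.
peak_ratio = front_peak / back_peak
print(peak_ratio)
```

Under these assumptions the density at the center scales as 1 / center, so the front lognormal is taller than the back one by a factor of 3.1 / 0.1.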

One way to avoid this issue is to sample from each lognormal individually, i.e. set up two settings files, each spanning the part of the distance range on one side of the cloud. The only issue with this is that you'd need to manually adjust the number of stars to sample for each catalog if the distance distribution isn't symmetric around the cloud distance. (Symmetric here means that if the cloud is at 60 kpc, the exponential distance prior is also centered on 60 kpc and extends X kpc in each direction, e.g. 50-70 kpc.)
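
For the asymmetric case, the share of stars to draw in front of the cloud could be estimated from the distance prior alone. The sketch below assumes the absexponential prior has the form exp(-|d - dist0| / tau), which is not confirmed here, and ignores any effect of the Av prior. The function names are hypothetical. The second call uses the 47-77 kpc range, 62 kpc peak, tau of 10 kpc and 55 kpc cloud distance quoted later in this thread; the tau in the symmetric call is an arbitrary choice.

```python
import numpy as np


def absexp_integral(a, b, dist0, tau):
    """Integral of exp(-|d - dist0| / tau) for a <= d <= b (assumed prior form)."""

    def cumulative(d):
        # Signed area under the curve between dist0 and d.
        return np.sign(d - dist0) * tau * (1.0 - np.exp(-abs(d - dist0) / tau))

    return cumulative(b) - cumulative(a)


def front_fraction(dmin, dmax, dist0, tau, cloud_dist):
    """Fraction of the distance-prior mass lying in front of the cloud."""
    front = absexp_integral(dmin, cloud_dist, dist0, tau)
    total = absexp_integral(dmin, dmax, dist0, tau)
    return front / total


# Symmetric case: cloud and prior peak both at 60 kpc, range 50-70 kpc.
symmetric = front_fraction(50.0, 70.0, 60.0, 10.0, 60.0)

# Asymmetric case: prior peaks at 62 kpc, cloud at 55 kpc, range 47-77 kpc.
offset = front_fraction(47.0, 77.0, 62.0, 10.0, 55.0)

n_total = 1000  # hypothetical catalog size
n_front = int(round(offset * n_total))
n_back = n_total - n_front
print(symmetric, offset, n_front, n_back)
```

In the symmetric case the split is even, while in the asymmetric case only a minority of the stars belongs in the front catalog.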

image

@karllark
Member

karllark commented Feb 7, 2024

Great issue description! Many thanks!

@karllark karllark added the bug label Feb 10, 2024
@karllark
Member

I do not fully understand the issue raised in point 1. I think maybe a quick conversation in person would be useful. Maybe next week?

@karllark
Member

For point 2, I am wondering if this is a grid sampling issue. Basically, a log-normal with a small center is quite narrow in A(V), unlike a log-normal with the same log-normal width but a large center. As a result, I wonder if the dust in front of the cloud is getting little weight because it is only sampled at the midpoint. One way to test this would be to run a grid with a higher A(V) in front of the cloud, say 2, and then have the step be 2 or something like that to stay inside your grid. Then the front would be 2.0 and the back would be 5. Keep the log-normal width at 0.1 and see what the simulations then show.

@christinawlindberg
Contributor Author

> I do not fully understand the issue raised in point 1. I think maybe a quick conversation in person would be useful. Maybe next week?

Sorry, I did a bad job of explaining what the original issue was. Originally, the distance distribution would always end up being flat when I simulated a population regardless of what I specified in the settings file. For example, in the following plot, I had specified an absexponential distribution for the distance prior but the actual results ended up being flat.

I think this issue stemmed from an error in updating the weight values in the astropy table, like I mentioned in my initial post.

image

@christinawlindberg
Contributor Author

> For point 2, I am wondering if this is a grid sampling issue. Basically, a log-normal with a small center is quite narrow in A(V), unlike a log-normal with the same log-normal width but a large center. As a result, I wonder if the dust in front of the cloud is getting little weight because it is only sampled at the midpoint. One way to test this would be to run a grid with a higher A(V) in front of the cloud, say 2, and then have the step be 2 or something like that to stay inside your grid. Then the front would be 2.0 and the back would be 5. Keep the log-normal width at 0.1 and see what the simulations then show.

I went ahead and ran a test with the following Av priors as you recommended, and I think you're partially right that it's some sort of Av gridding issue.

av_prior_model = {"name": "step",
                  "dist0": 55e3 * units.pc,
                  "amp1": 2.0,
                  "damp2": 2.0,
                  "lgsigma1": 0.1,
                  "lgsigma2": 0.1} 

I had set the distance prior as an absexponential from 47-77 kpc with a peak at 62 kpc (tau=10 kpc), but as you can see in the results below, an abnormally large number of sources are sampled from in front of the dust cloud, as seen in the lower left histogram.

Any ideas as to what could be causing this weird distribution in distance?

image

@karllark
Member

Thanks for the update. Sounds like we should fix the distance issue then see what that does for the simulation issue. Do you want to put in a PR for issue 1?

@christinawlindberg
Contributor Author

Hey Karl, I'd be happy to put in a PR for issue 1, but research is a little hectic at the moment, so the earliest I could realistically get to this is sometime next week.

@karllark
Member

I can look into putting together a PR for issue 1. I do not understand why updating in place does not work.

For 2, I think fixing this would require allowing for a non-linear (e.g., log10) bin spacing for A(V). Definitely possible. Just need to put in the logic for specifying the bin spacing and then update the grid_weights appropriately.
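
To make the bin-spacing idea concrete, here is a rough sketch of how grid weights might be derived from bin widths for either spacing, and of how many grid points fall under each lognormal. Everything here is an assumption for illustration: `bin_widths` and `n_points_within` are hypothetical helpers rather than BEAST functions, the grid ranges are invented, and the lognormal width of 0.05 is treated as a natural-log sigma.

```python
import numpy as np


def bin_widths(centers, log_spacing=False):
    """Bin widths in A(V), with edges midway between neighbouring centers.

    For log spacing the midpoints are taken in log10(A(V)).
    """
    c = np.log10(centers) if log_spacing else np.asarray(centers, dtype=float)
    mid = 0.5 * (c[1:] + c[:-1])
    first = c[0] - (mid[0] - c[0])
    last = c[-1] + (c[-1] - mid[-1])
    edges = np.concatenate(([first], mid, [last]))
    if log_spacing:
        edges = 10.0 ** edges
    return np.diff(edges)


def n_points_within(centers, center, sigma, nsig=3.0):
    """Number of grid points within +/- nsig sigma of a lognormal's center."""
    lo = center * np.exp(-nsig * sigma)
    hi = center * np.exp(nsig * sigma)
    return int(np.count_nonzero((centers >= lo) & (centers <= hi)))


lin_av = np.linspace(0.01, 10.0, 100)  # hypothetical linear A(V) grid
log_av = np.logspace(-2, 1, 100)       # hypothetical log10-spaced A(V) grid

sigma = 0.05
lin_front = n_points_within(lin_av, 0.1, sigma)
lin_back = n_points_within(lin_av, 3.1, sigma)
log_front = n_points_within(log_av, 0.1, sigma)
log_back = n_points_within(log_av, 3.1, sigma)

lin_weights = bin_widths(lin_av)
log_weights = bin_widths(log_av, log_spacing=True)
print(lin_front, lin_back, log_front, log_back)
```

With these made-up grids, the linear spacing puts far fewer points under the low-A(V) lognormal than under the high-A(V) one, while the log spacing samples both about equally. The log-spaced bin widths then grow with A(V), which is what the grid weights would need to account for.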

@karllark
Member

To help me out, can you send me your settings file?

@christinawlindberg
Contributor Author

Here's a copy of the BEAST settings files I've been using to make my simulation: beast_settings_sims_upper.txt
