Offline/Online LP for CIM scenario 📚 #359
Comments
Hi @riccardopoiani, thank you for your attention to our work. Currently, we have no plan to port all the methods used in the paper to the MARO platform. If you have questions about the implementation, we can discuss them. Thanks again! |
Hi @Jinyu-W, how about the methods proposed in these papers? Does the current MARO contain implementations of them? The "Cooperative Policy Learning with Pre-trained Heterogeneous Observation Representations" paper says that the source code is attached in the 'code' folder, but I couldn't find it. Can you help me, please? Thank you very much :) |
@wesley-stone could you provide more details about the methods in this paper? |
Hi @Jinyu-W, thank you for the response. I would be grateful to @wesley-stone for sharing code and related info. Many thanks again! :) |
Hi @Jinyu-W, thanks for your response and availability. First of all, I have seen that there is an open pull request to adjust the topology for CIM, and I was wondering if this was the cause. However, no issue describes the change made there. So I would start from here, and then maybe we can build a discussion on how to implement the method correctly. Also, do you have any method (of any sort) that is able to reach high performance with the current version of the simulator? |
Hi @riccardopoiani, I could share more information about the LP methods we used before.
We may open source a version of Online LP in the future, but not immediately; it will take some time. |
As for more information about the paper, the author @wesley-stone will have more to say. :) |
@Jinyu-W thanks for your feedback. I will definitely try to integrate the safety reward factor. Moreover, it seems to me that the crucial point for good performance (when feeding the plan to the simulator) is having estimates that are as accurate as possible (otherwise the plan will soon diverge from reality). |
Also, @Ahmedest61 I have seen something similar to what you are looking for in |
Ah, thank you so much @riccardopoiani. I'll have a look. Just wondering, is there any difference in terms of 'gnn-ac' (or any other methodology) between the 'temp_gnn' and 'maro-0.2_cim_topology_adjustment' branches? If not, I assume 'maro-0.2_cim_topology_adjustment' is the right, more stable one? Thank you! |
No idea on this @Ahmedest61. I am currently working on Edit: I had a fast look at it, and the pre-training phase seems to be missing to me, so idk |
Yup, I think this branch "temp_gnn" is similar to what paper 2 proposed, but again you're right about the PreLAC part! |
Hi, as far as I know, the LP methods used in paper 1 and paper 2 are not the same. The one used in paper 2 is closer to the formulation I have worked on before, so I can say more about it. As you mentioned above, the accuracy of the prediction can affect the final performance a lot. To keep the prediction error from being magnified over time, I added two "time windows" to the LP formulation:
I hope these help. |
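The two windows themselves are not listed in this thread, so the following is only a generic illustration of one common way to stop forecast errors from compounding: receding-horizon re-planning. It is a minimal sketch under stated assumptions, not the formulation used in either paper or in MARO. The names `plan_window`, `apply_window`, `forecast`, and `planner` are hypothetical; the planner stands in for an LP solve.

```python
from typing import Callable, List, Sequence

# Hypothetical signatures (assumptions for illustration, not MARO APIs):
#   forecast(start, end) -> predicted demand for ticks start .. end - 1
#   planner(start, predictions) -> one action per predicted tick
Forecast = Callable[[int, int], Sequence[float]]
Planner = Callable[[int, Sequence[float]], List[float]]


def receding_horizon(
    horizon: int,
    plan_window: int,
    apply_window: int,
    forecast: Forecast,
    planner: Planner,
) -> List[float]:
    """Plan over `plan_window` ticks, apply only the first `apply_window`,
    then re-plan with a fresh forecast."""
    if not 1 <= apply_window <= plan_window:
        raise ValueError("need 1 <= apply_window <= plan_window")

    actions: List[float] = []
    tick = 0
    while tick < horizon:
        window_end = min(tick + plan_window, horizon)
        plan = planner(tick, forecast(tick, window_end))
        steps = min(apply_window, horizon - tick)
        if len(plan) < steps:
            raise ValueError("planner returned too few actions")
        # Only the head of the plan is executed; the tail is discarded
        # because it relies on the least reliable part of the forecast.
        actions.extend(plan[:steps])
        tick += steps
    return actions


# Toy stand-ins: the forecast returns the tick index as "demand" and the
# planner simply echoes the forecast back as its actions.
def naive_forecast(start: int, end: int) -> List[float]:
    return [float(t) for t in range(start, end)]


def echo_planner(start: int, predictions: Sequence[float]) -> List[float]:
    return list(predictions)


if __name__ == "__main__":
    print(receding_horizon(5, 3, 2, naive_forecast, echo_planner))
```

A shorter `apply_window` means more frequent re-planning (and more solver calls), while a longer `plan_window` lets the plan see further ahead at the price of leaning on less accurate predictions.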
Thanks for the hints, but unfortunately, I am already doing both things. I tried many combinations of these hyperparameters, but they do not seem to solve my issue. It seems to be something more related to the estimation of the laden containers.
Am I missing something relevant here? Is this the way in which you estimate laden (and thus the supply and the vessel capacity for empty containers)? Moreover, what are the most relevant changes between paper 1 and paper 2 that you are aware of? |
I also noticed here that you were mentioning |
In our previous experience, the port capacity was never the limiting factor for the valid action scope, and we also set a large enough capacity for the ports in those topologies. So the absence of the port capacity check shouldn't make any difference. But I agree that it's better to add it back. |
Yes, you are right :) |
Any answer for this? |
Hi @wesley-stone, could any of you provide the environment configuration that you used in paper 2? |
Hi @wesley-stone @Jinyu-W, any update on the previous post? It seems relevant to me to provide good configurations within the environment, also because with these configs no repositioning performs very well, while with the master branch configurations it seems to me that the optimal solution is very far from 100%. |
@riccardopoiani Sorry for the late reply. Originally I wanted to reproduce the online LP for CIM and share it with you, but I was busy with other work :( As for the points in your problem formulation:
You can refer to this semi-finished example (for reference only) for more details. Currently, limited by the predicted values it takes as input, it still cannot reach the optimal solution. As for the changes between paper 1 and paper 2, sorry, but I don't have any idea about them. As for the details of paper 2, sorry, but I'm not the author and I don't know the details either. |
Thank you very much @Jinyu-W. I just ran a few tests on the environment, and they somewhat confirmed what I was thinking. The summary is the following: I believe that the current global trade configurations are "too hard" to achieve reasonable performance. Currently, all the methods that I have implemented and tested can't get beyond 70% order fulfillment (even in the simplest configuration, that is
is substituted with
This actually helped in reaching very good performance. The fact that this happened even with your implementation is an even stronger signal. I would like you to consider the possibility that the current configurations are "too hard" to be solved with good performance. In this sense I see two options: recover in some way the configurations used in the paper, or manually decrease the hardness as I did. |
Firstly, I want to express my heartfelt thanks for your careful and thorough verification @riccardopoiani. There are 2 things I want to clarify:
|
Thank you very much again @Jinyu-W. However, I would expect the repository to contain at least information on the best results that some method of yours has achieved on that environment. If you still consider this not to be enough, I will understand. Thanks again for your replies :) EDIT: Let me rephrase it a bit: the main point of having feasible environments (or the ones tested in the papers) is that you already have something to compare yourself to. E.g.: with feasible environments, if your method is good, you should get performance close to 100%. If you have the results in the paper, you should get performance close to that in the paper. |
I am just getting into this conversation now, and I hope to provide some useful information. @Jinyu-W please correct me if I am wrong. Concerning comparison with the results in the papers, I think one of the problems comes from using different cost functions, not only among the papers, but also between the papers and the current code. This is true for both the LP model and the RL-based model (if you are using the default reward function in main, it is way simpler than the one in the papers). @riccardopoiani maybe I missed this in the previous messages above: have you tried to use the cost function from the paper for the LP solver? I have not, and it is on my TODO list, just curious. The same thing applies to the RL models. I think that changing environment parameters is not the way to go, although I totally agree that it would be good to have a benchmark to use as a reference. Last question/observation: are you sure that the envs are not feasible? Feasibility of a problem (as a whole) is often independent of the solver we use. What I mean is: say we have a simple supply/demand problem and we need to satisfy all demand as a constraint; if supply << demand, we will never be able to find a feasible solution. OK, I know this is embarrassingly simplistic. Bottom line, I was wondering if there is a way to have a bound on the feasibility of the problem. The provided sample problems are classified by increasing difficulty, based on the order distribution, the noise, etc. Regardless of how hard it is to find a good solution, does the problem at least admit a feasible solution? |
Hello @WessZumino, both papers report performance in terms of container shortage; even if they use different reward functions and so on, the final metric is always the same. That is the kind of performance measure I am using as well, even if currently my objective/reward functions are functions of the environment that are a bit more complex. (i.e., I am not using
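The supply/demand argument above can be turned into a crude best-case bound. The sketch below is an assumption-laden toy, not a MARO utility: it looks at a single port in isolation, assumes the inflow of empty containers is known exactly, and assumes unfulfilled orders are lost rather than backlogged. Under those assumptions, shipping as much as possible at every tick maximizes the fulfilled total, so the returned ratio is a ceiling no solver can beat for that port. The function and parameter names are hypothetical.

```python
from typing import Sequence


def fulfillment_upper_bound(
    initial_empty: float,
    inflow: Sequence[float],
    demand: Sequence[float],
) -> float:
    """Best-case fraction of orders one port can fulfil.

    initial_empty: empty containers available before the first tick.
    inflow[t]:     empty containers arriving at tick t (assumed known).
    demand[t]:     containers ordered at tick t (lost if not served).
    """
    if len(inflow) != len(demand):
        raise ValueError("inflow and demand must cover the same ticks")

    stock = initial_empty
    fulfilled = 0.0
    for arriving, ordered in zip(inflow, demand):
        stock += arriving
        shipped = min(stock, ordered)  # cannot ship more than is on hand
        stock -= shipped
        fulfilled += shipped

    total = sum(demand)
    return 1.0 if total == 0 else fulfilled / total


if __name__ == "__main__":
    # 10 containers, no replenishment, 15 ordered in total.
    print(fulfillment_upper_bound(10, [0, 0, 0], [5, 5, 5]))
```

If this ratio is already well below 1 for some ports even with generous inflow assumptions, the shortfall comes from the configuration itself and not from the LP or RL method being evaluated.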
First of all, I apologize for the confusion. With unfeasible envs, I mean that there is no way of obtaining Of course I am not sure of my claims since I am not sure that my implementation is bug free. EDIT: This is "confirmed" in principle also by the fact that if you increase the vessel capacity to infinity, it does not affect the solution that you can find at all (even knowing all the info in advance). |
Hi @riccardopoiani, thanks for the clarification, it is very helpful. Concerning your point, I totally agree that the final metric is the same and therefore it is meaningful to compare against it. The part that confuses me is the following, and maybe it is just terminology: if my goal is to compare my result to the results reported in the papers, doesn't changing the objective/reward mean I am changing the underlying model? By model I mean what is represented by the objective function. If I think about a physical model, the model is entirely defined by a Hamiltonian H (the equivalent of a cost function to minimize); say I am interested in evaluating a certain metric M related to H. If I add new terms to H and evaluate the same metric M, can I say I am comparing the same model? Should I even expect the same result for M coming from different Hs? In physics the answer is no, but maybe it is fine within this context; I am not sure I understand. Now let us say everything is the same as in the paper: same objective, same training strategy, etc. Can I reproduce the results? Maybe you have already done this and I did not understand, so I am sorry in advance if I said something trivial.
Thanks, this is really useful information ! |
Yes @WessZumino, sorry for the misunderstanding; I thought you were mentioning more complex objectives as other papers on ECRs do. You are right that even in this case you would obtain different plans, and therefore different evaluations for M.
I've been trying so far. The problem is that all the global env cfgs are way different from the ones of both papers. |
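The point above, that a different objective yields a different plan and therefore a different value of the shared metric M, can be seen in a toy example. Everything here is made up for illustration: the two candidate plans, their numbers, and the `cost_weight` parameter are assumptions, not values from the papers or from MARO.

```python
from typing import Callable, Dict

Plan = Dict[str, float]

# Two hypothetical repositioning plans with invented outcomes.
PLANS: Dict[str, Plan] = {
    "reposition_heavily": {"shortage": 10.0, "transport_cost": 80.0},
    "reposition_lightly": {"shortage": 25.0, "transport_cost": 20.0},
}


def shortage_only(plan: Plan) -> float:
    """Objective A: minimise container shortage alone."""
    return plan["shortage"]


def shortage_plus_cost(plan: Plan, cost_weight: float = 0.5) -> float:
    """Objective B: shortage plus a weighted transport cost term."""
    return plan["shortage"] + cost_weight * plan["transport_cost"]


def best_plan(objective: Callable[[Plan], float]) -> str:
    """Name of the plan that minimises the given objective."""
    return min(PLANS, key=lambda name: objective(PLANS[name]))


def metric_m(plan_name: str) -> float:
    """The shared evaluation metric M: shortage of the chosen plan."""
    return PLANS[plan_name]["shortage"]


if __name__ == "__main__":
    for objective in (shortage_only, shortage_plus_cost):
        chosen = best_plan(objective)
        print(objective.__name__, chosen, metric_m(chosen))
```

Both runs are scored on the same M, yet the optimiser driven by objective B deliberately accepts more shortage to save transport cost, so matching a paper's shortage numbers requires matching its objective as well.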
Description
Add Offline and Online Linear programming agent for CIM scenario