Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: sub module conflict error #295

Open
wants to merge 8 commits into
base: master
Choose a base branch
from

Conversation

StellarisW
Copy link

@StellarisW StellarisW commented Nov 4, 2024

Currently the logic of register sub module to sys.modules may cause conflict,

That's because of the PR #249 which fix #215

def load(path,
         module_name=None,
         include_dirs=None,
         include_dir=None,
         encoding='utf-8'):

    real_module = bool(module_name)
    thrift = parse(path, module_name, include_dirs=include_dirs,
                   include_dir=include_dir, encoding=encoding)
    if threadlocal.incomplete_type:
        fill_incomplete_ttype(thrift, thrift)

    if real_module:
        sys.modules[module_name] = thrift
        sub_modules = thrift.__thrift_meta__["includes"][:]
        while sub_modules:
            module = sub_modules.pop()
            if module not in sys.modules:
                sys.modules[module.__name__] = module
                sub_modules.extend(module.__thrift_meta__["includes"])
    return thrift

For example:

We have thrift files as follows:

thrift files
├── idl
│   ├── main.thrift
│   ├── included1.thrift
│   ├── included2.thrift

If we call load('idl/main.thrift', module_name='idl.main_thrift'), then the following modules will be registered:

  • idl.main_thrift
  • included1_thrift
  • included2_thrift

So, if included1.thrift is a common module name like platform, which is platform.thrift, then module platform will register to sys.modules, so it will cause conflict.

Now I come up with 2 solutions:

  1. Add an option to load(), which will be def load(path, module_name=None, include_dirs=None, include_dir=None, encoding='utf-8'), and let the users to register sub modules manually. (not recommended)

  2. Optimize sub module name. (recommended)

    use absolute path as its module name instead of the base name of the thrift file.

    let's take the example before, then the following modules will be registered:

    • idl.main_thrift
    • idl.included1_thrift
    • idl.included2_thrift

@StellarisW StellarisW closed this Nov 4, 2024
@StellarisW StellarisW reopened this Nov 4, 2024
@StellarisW
Copy link
Author

Please take a review💌
@aisk @ethe

@aisk
Copy link
Member

aisk commented Nov 7, 2024

Sorry, I can't access my development computer these days. I will review it a few days later.

But the test case should be passed before merging.

@StellarisW
Copy link
Author

Sorry, I can't access my development computer these days. I will review it a few days later.

But the test case should be passed before merging.

That's OK, I'll fix the test cases.

@@ -949,3 +960,8 @@ def _get_ttype(inst, default_ttype=None):
if hasattr(inst, '__dict__') and '_ttype' in inst.__dict__:
return inst.__dict__['_ttype']
return default_ttype

def remove_suffix(s, suffix):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just use s.replace(suffix, "") is better.

import itertools

from faker import Faker 

SUFFIX = "__SOME_SUFFIX_"

class MyFaker(Faker):
    def suffix_name(self):
        return f"{self.name()}{SUFFIX}"

fake = MyFaker()
data_num = 10000
escape_ratio = 0.5
escape_num = int(data_num * escape_ratio)

suffix_texts = [fake.suffix_name() for _ in range(escape_num)]
normal_texts = [fake.name() for _ in range(data_num - escape_num)]

def test_and_replace():
    for text in itertools.chain(suffix_texts, normal_texts):
        if text.endswith(SUFFIX):
            res = text.replace(SUFFIX, "")
    return res

def test_and_replace_with_slie():
    for text in itertools.chain(suffix_texts, normal_texts):
        if text.endswith(SUFFIX):
            res = text[:-len(SUFFIX)]
    return res

def just_replace():
    for text in itertools.chain(suffix_texts, normal_texts):
        res = text.replace(SUFFIX, "")
    return res


__benchmarks__ = [
    (test_and_replace, just_replace, "just_replace"),
    (test_and_replace, test_and_replace_with_slie, "test_and_replace_with_slie")
]
                                         Benchmarks, repeat=5, number=20                                          
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━┳━━━━━━━━━┳━━━━━━━━━┳━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━┓
┃                  Benchmark ┃ Min     ┃ Max     ┃ Mean    ┃ Min (+)         ┃ Max (+)         ┃ Mean (+)        ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━╇━━━━━━━━━╇━━━━━━━━━╇━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━┩
│               just_replace │ 0.043   │ 0.043   │ 0.043   │ 0.021 (2.0x)    │ 0.022 (1.9x)    │ 0.022 (2.0x)    │
│ test_and_replace_with_slie │ 0.041   │ 0.042   │ 0.042   │ 0.040 (1.0x)    │ 0.041 (1.0x)    │ 0.040 (1.0x)    │
└────────────────────────────┴─────────┴─────────┴─────────┴─────────────────┴─────────────────┴─────────────────┘

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the advice, it seems better indeed.

Comment on lines 47 to 50
sub_modules = thrift.__thrift_meta__["sub_modules"][:]
for module in sub_modules:
if module not in sys.modules:
sys.modules[module.__name__] = include_thrift
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
sub_modules = thrift.__thrift_meta__["sub_modules"][:]
for module in sub_modules:
if module not in sys.modules:
sys.modules[module.__name__] = include_thrift
lost_modules = [
m for m in thrift.__thrift_meta__["sub_modules"] if m not in sys.modules
]
for module in lost_modules:
sys.modules[module.__name__] = include_thrift

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

cannot pickle dump object from child idl file class
3 participants