deeperlib.estimator.aggregation.sota_estimator(query_pool, api, match_term, uniqueid, query_num)

Estimates an aggregate over a search engine’s corpus efficiently. Reference: “Efficient search engine measurements”.
| Parameters: | query_pool, api, match_term, uniqueid, query_num |
|---|---|
| Returns: | Estimated count(*) of the search engine’s corpus |
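
The idea behind a pool-based count estimate can be sketched on a toy corpus. This is not deeperlib's implementation: `CORPUS`, `search`, `degree`, and `estimate_count` are hypothetical names, the search engine is simulated in memory, and result truncation (top-k overflow) is ignored. Each sampled query contributes 1/degree(d) for every document d it returns, where degree(d) is the number of pool queries that match d. Scaling the mean contribution by the pool size gives an unbiased estimate of how many documents the pool can reach.

```python
import random

# Toy corpus standing in for a hidden search engine (assumption): doc id -> terms.
CORPUS = {
    1: {"data", "web"},
    2: {"data"},
    3: {"web", "crawl"},
    4: {"crawl"},
    5: {"data", "crawl", "web"},
}
QUERY_POOL = ["data", "web", "crawl"]


def search(query):
    """Return the ids of all documents containing the query term."""
    return [doc_id for doc_id, terms in CORPUS.items() if query in terms]


def degree(doc_id):
    """Number of pool queries that match the document."""
    return sum(1 for q in QUERY_POOL if q in CORPUS[doc_id])


def estimate_count(query_pool, query_num, seed=0):
    """Estimate the number of documents reachable through the query pool."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(query_num):
        query = rng.choice(query_pool)
        # Each returned document is weighted by 1/degree so that a document
        # matched by several queries is not counted several times.
        total += sum(1.0 / degree(d) for d in search(query))
    return len(query_pool) * total / query_num


print(estimate_count(QUERY_POOL, query_num=3000))
```

Averaged over the whole pool, the per-query contributions sum to exactly the number of reachable documents, so issuing more queries only reduces the variance of the estimate.
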
deeperlib.estimator.aggregation.stratified_estimator(query_pool, api, match_term, candidate_rate, query_num, layer=5)

Estimates an aggregate over a search engine’s corpus efficiently and without bias. Reference: “Mining a search engine’s corpus: efficient yet unbiased sampling and aggregate estimation”.
| Parameters: | query_pool, api, match_term, candidate_rate, query_num, layer=5 |
|---|---|
| Returns: | Estimated count(*) of the search engine’s corpus |
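
Stratified sampling in general, the technique named in the cited title, can be illustrated with a small sketch. It makes no claim about how deeperlib forms its strata or uses `layer`; `CONTRIBUTION`, `STRATA`, and `stratified_total` are hypothetical, and the per-query contributions are made-up numbers. Queries are grouped into strata with similar contributions, each stratum is sampled separately, and every stratum's sample mean is scaled by the stratum's size before summing.

```python
import random
from statistics import mean

# Hypothetical per-query contributions to the aggregate being estimated.
CONTRIBUTION = {"a": 1.0, "b": 2.0, "c": 10.0, "d": 12.0}

# Two strata that group queries with similar contributions (assumption).
STRATA = [["a", "b"], ["c", "d"]]


def stratified_total(strata, value, per_stratum, seed=0):
    """Estimate sum(value(x)) over all items by sampling within each stratum."""
    rng = random.Random(seed)
    total = 0.0
    for items in strata:
        if not items:
            continue  # an empty stratum contributes nothing
        sample = [value(rng.choice(items)) for _ in range(per_stratum)]
        # Scale the stratum's sample mean by the number of items in it.
        total += len(items) * mean(sample)
    return total


estimate = stratified_total(STRATA, lambda q: CONTRIBUTION[q], per_stratum=2000)
print(estimate)
```

Because values inside a stratum are close to each other, each stratum mean has low variance, which is why the combined estimate tends to be tighter than one drawn from the unstratified pool with the same number of samples.
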
deeperlib.estimator.sampler.sota_sampler(query_pool, api, match_term, top_k, adjustment=1, samplenum=500)

Samples documents from a search engine’s corpus so that each document is drawn with the same probability. Reference: “Random sampling from a search engine’s index”.
| Parameters: | query_pool, api, match_term, top_k, adjustment=1, samplenum=500 |
|---|---|
| Returns: | A list of sampled documents returned by api |
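
One way to draw documents with equal probability through a query interface is rejection sampling over a query pool, sketched here on a toy corpus. This is an illustration under stated assumptions, not deeperlib's code: `CORPUS`, `matches`, `degree`, and `sample_documents` are hypothetical, the engine is simulated in memory, and the `adjustment` parameter is not modeled. A query is accepted in proportion to its result size, a document is picked uniformly from the results, and the document is kept with probability 1/degree, where degree counts the non-overflowing pool queries that match it.

```python
import random

# Toy corpus standing in for a hidden search engine (assumption): doc id -> terms.
CORPUS = {
    1: {"data", "web"},
    2: {"data"},
    3: {"web", "crawl"},
    4: {"crawl"},
    5: {"data", "crawl", "web"},
    6: {"data"},
}
QUERY_POOL = ["data", "web", "crawl"]


def matches(query):
    """Return the ids of all documents containing the query term."""
    return [doc_id for doc_id, terms in CORPUS.items() if query in terms]


def degree(doc_id, top_k):
    """Count pool queries that match the document and do not overflow top_k."""
    return sum(
        1 for q in QUERY_POOL
        if q in CORPUS[doc_id] and len(matches(q)) <= top_k
    )


def sample_documents(query_pool, top_k, samplenum, seed=0):
    """Draw samplenum document ids, each reachable document equally likely."""
    rng = random.Random(seed)
    samples = []
    while len(samples) < samplenum:
        query = rng.choice(query_pool)
        hits = matches(query)
        if len(hits) > top_k:
            continue  # overflowing query: a real engine would hide some matches
        if rng.random() >= len(hits) / top_k:
            continue  # accept queries in proportion to their result size
        doc_id = rng.choice(hits)
        # Keep the document with probability 1/degree to cancel the bias
        # toward documents matched by many queries.
        if rng.random() < 1.0 / degree(doc_id, top_k):
            samples.append(doc_id)
    return samples


print(sample_documents(QUERY_POOL, top_k=4, samplenum=10))
```

With these two acceptance steps, every (query, document) pair is proposed with the same probability, and the 1/degree step divides out the number of pairs per document, so each reachable document ends up equally likely.
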