Conference Papers

Permanent URI for this collection: https://idr.nitk.ac.in/handle/123456789/28506

  • Item
    DYNA-RANK: Efficient calculation and updation of pagerank
    (2008) Kale, M.; Santhi Thilagam, P.S.
    Ranking web pages is critical because the web is growing and changing very rapidly. Result ranking plays a crucial role in a search engine over a database as large as the web, where a single query can have millions of results, and how users browse the web depends largely on the ranking of the search results. Existing approaches to calculating PageRank values are mostly centralized, and the distributed ones are not used in practice because of scalability concerns. Centralized approaches treat the entire web as one graph and recalculate the PageRank values of the whole graph after a certain time period, which takes a long time to execute, possibly days. Likewise, any update to the graph forces the PageRank values of all pages in the graph to be recalculated. This suggests that a distributed algorithm could replace the centralized PageRank calculation. Given the importance of ranking in the search context, our approach, DYNA-RANK, focuses on efficiently calculating and updating Google's PageRank vector using a peer-to-peer system. Changes in the web structure are handled incrementally among the peers. DYNA-RANK produces the relative PageRank on each peer. DYNA-RANK is shown to take less computation time and fewer iterations than the centralized approach. © 2008 IEEE.
  • Item
    Performance analysis of graph based iterative algorithms on MapReduce framework
    (Institute of Electrical and Electronics Engineers Inc., 2014) Debbarma, A.; Annappa, B.; Mude, R.G.
    In recent years, there has been enormous growth in the amount of digital data being produced. Numerous attempts are being made to process this large amount of data quickly and effectively. Hadoop MapReduce is one such software framework that has gained popularity in the last few years for distributed computation on Big Data. It provides a scalable, economical and easier way to process massive amounts of data in parallel on large computing clusters while transparently preserving fault tolerance. However, Hadoop always stores intermediate results on the local disk when running iterative jobs. As a result, Hadoop usually suffers from long execution times for iterative jobs, as it typically pays a high I/O cost and wastes CPU cycles and network bandwidth. This paper analyses the problems of existing Hadoop and compares its performance against iMapReduce and HaLoop for graph-based iterative algorithms. HaLoop offers better performance because it stores intermediate results in a cache and reuses that data in the next iteration. To exploit cached invariant data (inter-iteration locality), it schedules tasks from different iterations that use the same data onto the same node. © 2014 IEEE.
  • Item
    Effective Information Retrieval, Question Answering and Abstractive Summarization on Large-Scale Biomedical Document Corpora
    (Springer Science and Business Media Deutschland GmbH, 2023) Shenoy, N.; Nayak, P.; Jain, S.; Kamath S., S.; Sugumaran, V.
    During the COVID-19 pandemic, a concentrated effort was made to collate published literature on SARS-CoV-2 and other coronaviruses for the benefit of the medical community. One such initiative is the COVID-19 Open Research Dataset, which contains over 400,000 published research articles. To expedite access to relevant information sources for health workers and researchers, it is vital to design effective information retrieval and information extraction systems. In this article, an IR approach leveraging transformer-based models to enable question answering and abstractive summarization is presented. Various keyword-based and neural-network-based models are tested and incorporated to reduce the search space and determine relevant sentences from the vast corpus for ranked retrieval. For abstractive summarization, candidate sentences are determined using a combination of standard scoring metrics. Finally, the summary and the user query are used to support question answering. The proposed model is evaluated using standard metrics on the standard CovidQA dataset for both natural language and keyword queries. The proposed approach achieved promising performance for both query classes, while outperforming various unsupervised baselines. © 2023, The Author(s), under exclusive license to Springer Nature Switzerland AG.
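
For readers unfamiliar with the centralized computation that the DYNA-RANK abstract above contrasts itself with, the following is a minimal sketch of textbook PageRank by power iteration over one whole graph. It is not the DYNA-RANK algorithm and is not taken from any of the listed papers. The function name `pagerank`, the damping factor 0.85, the tolerance, and the three-page example graph are all assumptions chosen for illustration.

```python
def pagerank(links, damping=0.85, tol=1e-9, max_iter=100):
    """Textbook centralized PageRank by power iteration (illustrative only).

    links: dict mapping each page to the list of pages it links to.
    Returns (ranks, iterations_used).
    """
    # Collect every page that appears as a source or as a link target.
    pages = sorted(set(links) | {q for outs in links.values() for q in outs})
    n = len(pages)
    if n == 0:
        return {}, 0

    rank = {p: 1.0 / n for p in pages}  # start from a uniform vector
    iteration = 0
    for iteration in range(1, max_iter + 1):
        # Rank held by pages without out-links is spread uniformly.
        dangling = sum(rank[p] for p in pages if not links.get(p))
        base = (1.0 - damping) / n + damping * dangling / n
        new_rank = {p: base for p in pages}

        # Every page splits its damped rank equally among its out-links.
        for p in pages:
            outs = links.get(p, [])
            if outs:
                share = damping * rank[p] / len(outs)
                for q in outs:
                    new_rank[q] += share

        # Stop once the L1 change between iterations is small enough.
        delta = sum(abs(new_rank[p] - rank[p]) for p in pages)
        rank = new_rank
        if delta < tol:
            break
    return rank, iteration


# Hypothetical three-page graph, used only to exercise the function.
example_graph = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}
ranks, iterations = pagerank(example_graph)
```

Every iteration touches every page and every link, so a change to a single link means rerunning the loop over the whole graph. That full recomputation is the cost the DYNA-RANK abstract says it avoids by handling changes incrementally among peers.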