Plant and animal endemism in the eastern Andean slope: challenges to conservation

Jennifer Swenson

The Andes-Amazon basin in Peru and Bolivia is a biologically rich but understudied area with high endemism and unknown species distributions.

Measuring News Similarity Across Ten U.S. News Sites

Alexander Nwala

This paper presents a method to identify and measure the similarity of the top emphasized news stories across various U.S.-based news websites, highlighting the widespread but poorly quantified phenomena of editorial decision-making and story selection.

The Many Shapes of Archive-It

Alexander Nwala

Web archives are crucial for digital preservation, enabling journalists, social scientists, historians, and government organizations to curate and manage their own collections by selecting original resources.

Bootstrapping Web Archive Collections from Social Media

Alexander Nwala

Automatically and semiautomatically generated archived web collections from social media platforms offer a cost-effective alternative to human-curated collections and were analyzed for their similarity to Archive-It collections.

Scraping SERPs for Archival Seeds: It Matters When You Start

Alexander Nwala

The paper investigates how the retrievability of URIs of news stories found on Google changes over time, impacting event-based collection building.

Query-Driven Multimodal GraphRAG: Dynamic Local Knowledge Graph Construction for Online Reasoning

Yi He

The proposed Query-Driven Multimodal GraphRAG framework enhances interpretability and reliability of LLMs in complex reasoning tasks by dynamically constructing query-specific local knowledge graphs, excelling in cross-modal understanding and achieving state-of-the-art performance on MultimodalQA and WebQA datasets.

HGDL: Heterogeneous Graph Label Distribution Learning

Heng Lian, Yi He

This paper introduces a novel framework for heterogeneous graph label distribution learning (HGDL) that addresses challenges of node type, attribute, and neighborhood structure heterogeneity using proactive graph topology homogenization and a consistency-aware graph transformer, demonstrating its effectiveness through theoretical and empirical validation.

Learning Gradual Typing Performance

Yi He

Gradual typing, which seeks to merge the benefits of static and dynamic typing, faces challenges with unpredictable performance, prompting efforts to optimize it, though understanding and managing the performance landscape during program migration remains underdeveloped.

Towards Utilitarian Online Learning – A Review of Online Algorithms in Open Feature Space

Yi He

This paper reviews recent advancements in Utilitarian Online Learning (UOL) within open feature spaces, categorizes existing models, assesses their strengths and weaknesses, examines application scenarios, benchmarks model performance, and explores challenges and future research directions.

Generating Virtual Reality Stroke Gesture Data from Out-of-Distribution Desktop Stroke Gesture Data

Jindong Wang

The paper utilizes desktop interaction data to generate VR interaction data, focusing on time-varying stroke gestures to aid user behavior analysis and experience enhancement.

Improving Generalization of Adversarial Training via Robust Critical Fine-Tuning

Jindong Wang

The paper introduces Robustness Critical Fine-Tuning (RiFT), a method aimed at improving the generalization of deep neural networks while maintaining adversarial robustness, addressing limitations of traditional Adversarial Training.

MetaFed: Federated Learning among Federations with Cyclic Knowledge Distillation for Personalized Healthcare

Jindong Wang

MetaFed is a novel framework for federated learning that enhances model personalization and performance across federations without a central server, using Cyclic Knowledge Distillation to overcome data heterogeneity, and improves accuracy and communication efficiency in healthcare applications.

Exploiting Unlabeled Data for Target-Oriented Opinion Words Extraction

Jindong Wang

The paper proposes a Multi-Grained Consistency Regularization (MGCR) method to leverage unlabeled data for improving Target-oriented Opinion Words Extraction (TOWE), addressing training data scarcity and distribution shifts, and demonstrates its effectiveness on benchmark datasets.

Generalizing to Unseen Domains: A Survey on Domain Generalization

Jindong Wang

This paper provides a comprehensive review of recent advances in domain generalization, a field focused on developing models that can generalize to unseen test domains, by defining the concept, categorizing related algorithms, discussing theories, and suggesting future research topics.

AutoRuleSQL: Hybrid Text-to-SQL via Rule-Driven Fast Paths and LLM Bootstrapping

Haipeng Chen

AutoRuleSQL, a hybrid NL2SQL system, enhances real-time query efficiency by combining template-based methods with LLM fallback, reducing latency by over 12.6% and improving accuracy by up to 4.0%.

Sequential Stochastic Combinatorial Optimization Using Hierarchical Reinforcement Learning

Haipeng Chen

This paper introduces a novel hierarchical reinforcement learning framework, wake-sleep option (WS-option), for sequential stochastic combinatorial optimization problems, demonstrating improved effectiveness, generalizability, and computational efficiency over traditional methods.

Can Reinforcement Learning Solve Asymmetric Combinatorial-Continuous Zero-Sum Games?

Haipeng Chen

The paper introduces and analyzes two-player Asymmetric Combinatorial-Continuous zEro-Sum (ACCES) games, proves Nash equilibrium existence, develops the Combinatorial Continuous DO (CCDO) algorithm to solve them, and presents the CCDORL algorithm based on reinforcement learning, with experiments validating their effectiveness.

Population Aware Diffusion for Time Series Generation

Haipeng Chen

PaD-TS is a new time series generation model designed to preserve population-level properties, like value distributions and cross-correlation, reducing distribution shifts while maintaining individual-level data authenticity, demonstrating substantial improvements over existing models.

Self-Correction is More than Refinement: A Learning Framework for Visual and Language Reasoning Tasks

Qingyun Wang

The study explores how Vision-Language Models can self-correct and improve through a Self-Correction Learning approach applied during fine-tuning, demonstrating enhanced performance without external feedback, in contrast to self-correction attempted only during iterative inference.

CALM: Unleashing the Cross-Lingual Self-Aligning Ability of Language Model Question Answering

Qingyun Wang

This paper investigates the ability of Language Models to align multilingual knowledge, improving cross-lingual question answering and performing effectively in zero-shot and retrieval-augmented contexts.

SCIMON: Scientific Inspiration Machines Optimized for Novelty

Qingyun Wang

The study aims to improve neural language models' capacity to generate innovative scientific ideas from literature by using background contexts instead of traditional binary link prediction, thus enhancing expressivity and novelty.